PDF hosted at the Radboud Repository of the Radboud University Nijmegen

The following full text is a publisher's version.

For additional information about this publication click this link. http://hdl.handle.net/2066/107980

Please be advised that this information was generated on 2021-09-29 and may be subject to change.

Gene identification in intellectual disability

Janneke H.M. Schuurs-Hoeijmakers

Cover design and layout by In Zicht Grafisch Ontwerp, Arnhem
Printed and bound by Ipskamp Drukkers, Enschede

ISBN 978-90-9027086-9

Copyright © 2012 J.H.M. Schuurs-Hoeijmakers

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system of any nature, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the holder of the copyright.

Gene identification in intellectual disability

Doctoral thesis

to obtain the degree of doctor from Radboud University Nijmegen, on the authority of the rector magnificus, prof. mr. S.C.J.J. Kortmann, according to the decision of the Council of Deans, to be defended in public on Thursday 20 December 2012 at 13:00 hours precisely

by

Johanna Hendrica Maria Schuurs-Hoeijmakers
born on 26 March 1982 in Zevenhuizen (ZH), the Netherlands

Supervisors
Prof. dr. H.G. Brunner
Prof. dr. H. van Bokhoven

Co-supervisors
Dr. B.B.A. de Vries
Dr. A.P.M. de Brouwer

Manuscript committee
Prof. dr. M.A.A.P. Willemsen (chair)
Prof. dr. M.H. Breuning (Leids Universitair Medisch Centrum, Leiden)
Prof. dr. M. Nordenkjöld (Karolinska Institutet, Stockholm, Sweden)

Paranymphs
Wieteke Faber-Hoeijmakers
Anneke Vulto-van Silfhout

Table of contents

Chapter 1 Intellectual disability: a general introduction
Chapter 2 Homozygosity mapping in outbred families with Mental Retardation
Chapter 3 Identification of recessive pathogenic alleles in small sibling families with intellectual disability
Chapter 4 Mutations in DDHD2, encoding an intracellular phospholipase A1, cause a recessive form of complex Hereditary Spastic Paraplegia
Chapter 5 Recurrent de novo mutations in PACS1 cause defective cranial neural crest migration and define a recognizable intellectual disability syndrome
Chapter 6 Discussion and future perspective

Summary
Samenvatting
Reference List
Dankwoord | Acknowledgements
Curriculum Vitae
List of publications
List of abbreviations
Color figures
Thesis series of the Institute for Genetic and Metabolic Disease

CHAPTER 1

Intellectual disability: a general introduction

1.1 Definition
1.2 Prevalence
1.3 Inherited ID and etiology
1.4 Techniques for detection of genetic causes of ID: a historical overview
1.4.1 Karyotyping
1.4.2 Targeted chromosome analysis
1.4.3 Array CGH: emerging genomic disorders
1.4.4 Array CGH: gene discovery
1.4.5 Gene discovery by linkage studies
1.4.6 Massive parallel sequencing
1.4.6.1 Massive parallel sequencing: a revolution for human genetics
1.4.6.2 The coding sequences of the genome: the exome
1.4.6.3 Gene discovery in ID syndromes by use of exome sequencing
1.4.6.4 Massive parallel sequencing for autosomal gene identification in non-syndromic ID
1.4.6.5 Massive parallel sequencing for recessive gene identification
1.4.6.6 Exome sequencing in diagnostics
1.5 Scope and outline of this thesis
1.5.1 Why do we study genetics of ID?
1.5.2 Aims
1.5.3 How did we study genetics underlying ID?

1.1 Definition

The brain is one of the most complex and intriguing human organs. Its fine-tuned development, cellular connectivity and plasticity are crucial for normal functioning. The outcome of aberrant brain development and/or aberrant interaction between brain cells can clinically present as intellectual disability (ID). ID comprises a group of clinically and etiologically heterogeneous cognitive disorders that is primarily characterized by defects and subsequent malfunctioning of the brain. Clinicians often subdivide ID into syndromic (specific) and non-syndromic (non-specific) forms, depending on the presence or absence, respectively, of additional clinical features accompanying ID, such as dysmorphic, metabolic, neuromuscular or psychiatric features and congenital malformations [1]. The terminology used to describe individuals with ID has changed frequently. Names used in the past for those with ID include idiot, imbecile, feeble-minded, mentally handicapped and mentally retarded [2]. Over the last few years there has again been a change in terminology, brought about by President Obama signing Rosa's Law on the 8th of October 2010, thereby replacing several instances of 'mental retardation' in law with 'intellectual disability' [2] (Box 1).

Box 1 The introduction of Rosa's Law replaces 'mental retardation' with 'intellectual disability'

S. 2781 (111th): Rosa's Law
111th Congress, 2009-2010
A bill to change references in Federal law to mental retardation to references to an intellectual disability, and to change references to a mentally retarded individual to references to an individual with an intellectual disability.

Introduced: November 17th, 2009
Sponsor: Sen. Barbara Mikulski [D-MD]
Status: Signed by the President, October 8th, 2010

For formal clinical diagnosis and classification of ID, three systems are mainly in use: the International Statistical Classification of Diseases and Related Health Problems (ICD-10) [3] from the World Health Organization, the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) [4] from the American Psychiatric Association and the definition of the American Association on Intellectual and Developmental Disabilities (11th edition, AAIDD) [5]. All definitions state a substantial impairment in both cognitive functioning and adaptive behavior, originating in the developmental period (defined as before 18 years of age). Cognitive or intellectual functioning refers to general mental capacities, such as learning, reasoning and problem solving, and can be measured by intelligence quotient (IQ) testing, where 100 IQ points is the mean and an IQ score <70 is indicative of substantial impairment in intellectual functioning. Several tests have been developed for the calculation of intelligence (reviewed by Deary et al. [6]). One of the most widely used tests for IQ assessment was devised by Wechsler, comprising preschool, child and adult versions. IQ measurements give an indication of the severity of ID, frequently subdivided into four categories: mild (IQ 50-70), moderate (IQ 35-50), severe (IQ 20-35) and profound ID (IQ <20) (Table 1). The AAIDD does not subdivide ID based on IQ score and rather concentrates on the nature of needed support: intermittent, limited, extensive or pervasive. Adaptive behavior consists of (i) conceptual skills, such as language, literacy and self-direction, (ii) social skills, such as interpersonal skills, social responsibility, self-esteem, social problem solving and the ability to follow rules, and (iii) practical skills, such as personal care and occupational skills. In short, for clinicians, impairment in (one of) these areas of adaptive behavior combined with cognitive impairment is the clinical hallmark on which the diagnosis of ID is based.

Table 1 Subdivision of severity of ID based on IQ measurement

              DSM-IV-TR classification    ICD-10 classification
Terminology   Code     IQ level           Code   IQ level
mild          317      50-55 to 70        F70    50-69
moderate      318.0    35-40 to 50-55     F71    35-49
severe        318.1    20-25 to 35-40     F72    20-34
profound      318.2    <20-25             F73    <20

Note: DSM-IV-TR takes into account a measurement error of five points, as this is inherent to intelligence testing.

1.2 Prevalence

Intelligence of the (healthy) population follows a normal or Gaussian distribution, with IQ scores lying approximately between 50 and 150 (Figure 1), indicating that an IQ <70 (<-2 SD) will be observed as normal variation within the population. However, as already observed and described in 1967 by Zigler [7], this is not a completely representative image of the empirical distribution of intelligence: people with ID in whom a physiological defect is suspected to underlie the ID phenotype fall for the most part outside the range of normal intelligence and represent a second distribution curve, in addition to the Gaussian curve, with a considerably lower mean IQ (Figure 1) [7]. An appropriate representation of the intelligence distribution therefore encompasses two distribution curves: a normal distribution curve with a mean of 100 IQ points for the 'healthy' population, representing polygenic or multifactorial intelligence distribution, and a second curve with a mean around 35, representing a group of people with ID due to pathophysiological defects, such as perinatal and monogenic causes. To estimate the prevalence of ID, intelligence and its distribution is only one of the factors that need to be considered. Determining the prevalence of ID is not straightforward, as several factors influencing estimation and interpretation need to be taken into account [1,8], not least the above-mentioned different definition systems that are in use to diagnose individuals with ID. The variation in key criteria to define and diagnose ID may influence prevalence estimates and complicates comparison between different studies.

Figure 1 Schematic representation of the distribution of IQ scores within the population

[Figure: two distribution curves; x-axis: IQ score.] The IQ distribution of the 'normal' population has the shape of a Gaussian curve, with the mean IQ set at 100 IQ points. IQ scores below 70 are observed in ~2.5% of the population. The second curve, with the mean around 35 IQ points, represents individuals with a pathophysiological cause underlying low IQ measures. Adapted from Zigler et al. 1967.

Besides, several other factors affect prevalence estimates, among which is the type of data source from which estimates are derived. Ideally, cases should be ascertained from an entire population or a representative sample of it, and not be limited to those individuals who receive special services or who live in institutions. Nevertheless, the latter are usually better recorded and easier to access. Therefore, the 'ascertained prevalence', i.e. the number of cases recorded by authorities, may not be fully representative of the 'true prevalence', i.e. the total number of intellectually disabled persons in a population. To approximate 'true prevalence' rates more closely, multiple administrative data sources need to be combined [1,9]. Also, the selection of population age and gender groups that are at risk (children, adults, elderly, total population; male versus female) will result in different estimates of prevalence. A peak in prevalence rate for ID is seen at age 10-14 years, demonstrating the impact of the educational system on identification of individuals with ID [1], whereas a higher prevalence rate of ID is reported among males as compared to females (1.3-1.4 times higher), at least partly attributed to X-linked conditions [10-13]. ID prevalence rates, especially for mild ID, are also influenced by social, economic, cultural, ethnic and other environmental factors. Biotechnological advances, such as prenatal diagnosis with a consequent increase in abortion rate, and treatment of several congenital disorders such as phenylketonuria and congenital hypothyroidism, also influence prevalence rates. The most comprehensive study of ID prevalence was published in 2002 by Leonard and Wen [1]. They give an estimate of the prevalence of ID in the general population in Western countries of 0.3-0.5% for severe ID and 1-3% for the total group of ID, indicating that ID is a common disorder.
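The Gaussian part of the IQ distribution described in this section can be checked with a short calculation. The sketch below is only an illustration: it assumes a standard deviation of 15 IQ points (implied by equating an IQ of 70 with -2 SD, but not stated as a number in the text), and the standard deviation and weight of the second, pathophysiological curve are hypothetical placeholders, since only its mean of around 35 is given.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_cdf(x: float, weight_second: float,
                mean1: float, sd1: float,
                mean2: float, sd2: float) -> float:
    """CDF of a two-component mixture of normal distributions."""
    first = (1.0 - weight_second) * normal_cdf(x, mean1, sd1)
    second = weight_second * normal_cdf(x, mean2, sd2)
    return first + second


MEAN_IQ = 100.0        # from the text
SD_IQ = 15.0           # assumption: IQ 70 corresponds to -2 SD
SECOND_MEAN = 35.0     # from the text (mean of the second curve)
SECOND_SD = 15.0       # hypothetical value, for illustration only
SECOND_WEIGHT = 0.01   # hypothetical share of the second curve

# Fraction of the single Gaussian curve that lies below an IQ of 70.
fraction_below_70 = normal_cdf(70.0, MEAN_IQ, SD_IQ)
print(f"Single Gaussian, IQ < 70: {fraction_below_70:.2%}")  # 2.28%

# The same fraction when the second curve is added.
mixture_below_70 = mixture_cdf(70.0, SECOND_WEIGHT, MEAN_IQ, SD_IQ,
                               SECOND_MEAN, SECOND_SD)
print(f"Two-curve mixture, IQ < 70: {mixture_below_70:.2%}")
```

Under these assumptions the single Gaussian curve places about 2.3% of the population below an IQ of 70, close to the ~2.5% quoted for Figure 1; adding a second curve can only raise that fraction, by an amount that depends entirely on the assumed weight.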

1.3 Inherited ID and etiology

ID can follow all imaginable inheritance patterns, from monogenic autosomal dominant (mainly caused by de novo mutations, but this group also contains imprinting defects, mosaic and fully or partially penetrant mutations), autosomal recessive and X-linked inheritance to mitochondrial, di- to polygenic and multifactorial inheritance. These different inheritance patterns call for different approaches to gene identification (discussed in 1.4). Mutations in more than 450 genes annotated in the Online Mendelian Inheritance in Man database (OMIM: http://omim.org/) [14,15] have ID as a phenotypic feature, of which 22% cause dominant forms of ID, 56% autosomal recessive and 22% X-linked forms (Figure 2). Most of these known genes (~400) cause syndromic ID phenotypes. ID can be classified into three etiological categories: genetic, non-genetic and unknown causes. Genetic causes can be divided into a group of individuals with a known underlying molecular defect and a group where the molecular defect is not known. The non-genetic group can roughly be subdivided based on the timing during development of the ID-causing event: prenatally, perinatally and postnatally derived ID [16]. At the genetic level, the defects leading to ID encompass a plethora of defects ranging from whole-chromosome aneuploidies and multiploidies and chromosomal rearrangements such as translocations and (micro)deletions and -duplications, to imprinting disorders and nucleotide repeat expansions, down to single base pair changes. More than 537 ID phenotypes with a known molecular basis are annotated in the Online Mendelian Inheritance in Man database [OMIM, July 2012]. Even with this number of genetic defects known to lead to ID, about half of the individuals with a suspected genetic cause underlying ID do not receive a molecular diagnosis [17,18].

Figure 2 Distribution of inheritance mode of 472 human genes involved in ID phenotypes

Genes are derived from the Online Mendelian Inheritance in Man database [14] (July 2012) and have ID reported as a phenotypic feature in the 'clinical synopsis', 'clinical features' or 'description' section.

1.4 Techniques for detection of genetic causes of ID: a historical overview

1.4.1 Karyotyping

ID syndromes have been recognized by physicians for ages, but it was not until 1959 that an extra chromosome, later recognized as a trisomy of chromosome 21, was identified as the genetic cause of Down syndrome [19,20], which is the most frequent cause of ID with an incidence of 1/700-1,000 live births [21,22]. Identification of other autosomal aneuploidies followed, such as trisomy 13 (Patau syndrome) and trisomy 18 (Edwards syndrome) [23,24], and of sex chromosomal aneuploidies, such as monosomy X (Turner syndrome) [25,26] and 47,XXY (Klinefelter syndrome) [27,28]. From the 1970s on, conventional karyotyping by use of Giemsa staining of condensed chromosomes became available as a diagnostic test, providing means to detect chromosomal aberrations with a resolution of up to 5 Mb. This led to identification of several clinically recognizable chromosome aberration syndromes, such as 5p- syndrome (Cri-du-chat syndrome) [29] and 4p- syndrome (Wolf-Hirschhorn syndrome) [30,31]. The fragile site close to the telomere end of the long arm of chromosome X, best visualized when using special culture techniques for leucocytes such as folate and thymidine depletion, was discovered in individuals with Martin-Bell syndrome [32-36], now mostly known as Fragile X syndrome. The underlying molecular defect, an unstable expanded (CGG)n trinucleotide repeat sequence in the 5' untranslated region of FMR1, was discovered in 1991 as the basis of Fragile X syndrome [37]. Fragile X syndrome is the most common heritable cause of ID, affecting approximately 1 in 3,500-6,000 males [18,38,39], and the first disorder in which a dynamic mutation was identified. Microscopically detectable chromosome abnormalities are found in 5-10% of individuals with ID [18,40,41] (Figure 3). After its implementation in the seventies, karyotyping remained the major diagnostic test for over 30 years.

1.4.2 Targeted chromosome analysis

In the years following the introduction of karyotyping in clinical practice, targeted approaches such as fluorescence in situ hybridization (FISH) karyotyping, quantitative polymerase chain reaction (QPCR) and multiplex ligation-dependent probe amplification (MLPA) came into use. In addition, breakpoint mapping was used in ID research to identify disruption of genes. This led to identification of interstitial submicroscopic and subtelomeric deletion syndromes and occasionally single gene disruptions, thereby broadening the diagnostic spectrum of tests available for ID syndromes. Examples of clinically recognizable interstitial microdeletion syndromes are Williams-Beuren syndrome (deletion 7q11.23) [42-44], Angelman syndrome (maternal deletion 15q11-13) [45,46], Prader-Willi syndrome (paternal deletion 15q11-13) [47,48] and DiGeorge syndrome (deletion 22q11) [49,50]. The more prevalent subtelomeric deletion syndromes are 1p36, 1q44, 2q23, 9q34 (nowadays known as Kleefstra syndrome) and 22q13 deletion syndromes (reviewed by de Vries et al. [51] and by van Bon [52]). Translocation breakpoint mapping led to the identification of the causative gene in Kleefstra syndrome, EHMT1, encoding a euchromatin histone methyltransferase [53]. Subsequent Sanger sequencing revealed point mutations, further confirming EHMT1 as the gene responsible for the core Kleefstra syndrome features [53,54]. By determining the critical region of the 22q13.3 deletion syndrome, SHANK3 was identified as the gene responsible for at least part of the phenotype [55,56]. Adding the above-mentioned techniques to the diagnostic toolset for ID led to a doubling of the diagnostic yield, from 5-10% for karyotyping alone to 10-15% if combined with FISH karyotyping and subtelomeric MLPA [51,57-59].

Figure 3 Increasing resolution of techniques applied for genome wide investigation of individuals with intellectual disability and their approximate cumulative diagnostic yield (in percentage)

Technique              Resolution                Cumulative diagnostic yield
G-banded karyotyping   5-10 Mb                   10-15%*
Array CGH              20 kb                     30%
Exome sequencing       Single nucleotide level   65%**

Time axis: 1970, 2000, 2010, 2012.

* This percentage includes targeted techniques such as FISH karyotyping and subtelomeric MLPA. ** For exome sequencing, recent studies indicate a yield of up to ~65% when applied as first diagnostic test, depending on the inheritance model studied [106-110,115,134,157,160].

1.4.3 Array CGH: emerging genomic disorders

At the turn of the century, array comparative genomic hybridization (CGH) was introduced and changed the nature of human genome analysis by combining the high-resolution, targeted approach of FISH with the genome-wide approach of karyotyping. The first widely used array platforms (bacterial, P1-derived and yeast artificial chromosome-based arrays) had a genome-wide resolution of 1 Mb at best [60-62], and these were quickly replaced by full tiling arrays with approximately 30,000 clones and a resolution of one clone every 100 kb [63-67]. Over the last decade, there has been an enormous increase and improvement in both the resolution and the even spacing of SNP and/or oligo probes across the genome, resulting in array platforms that today can detect copy number variations (CNVs) down to exon level (>2 million CNV markers). Array CGH offered the possibility to interrogate the whole genome at copy number level in an unbiased fashion and allowed for the detection of causative submicroscopic recurrent and non-recurrent deletions and duplications (reviewed by van Bon [52]), also designated as CNVs, with or without emerging recognizable clinical phenotypes. Recurrent microdeletion and -duplication syndromes are mediated by low copy repeat (LCR) regions, and therefore these aberrations always have the same size in unrelated individuals, whilst in non-recurrent microdeletion and -duplication syndromes the aberration size varies between individuals. Examples of such recurrent submicroscopic deletions and duplications are the 1q21 [68], 7q11.23 [69], 10q22q23 [70,71], 15q13.3 [72-74], 16p12.2p11.2 [75,76], 17p11.2 (Smith-Magenis syndrome [77,78], Potocki-Lupski syndrome [79,80]) and 17q21.31 [81,82] microdeletion/duplication syndromes. Microdeletions of 2q23 [83], 8q21.11 [84], 18q21.1 (Pitt-Hopkins syndrome [85,86]), 19q13.11 [87-89] and 22q13.3 [90,91] are examples of non-recurrent submicroscopic deletion syndromes. Studying the healthy population for genome-wide CNVs showed the immense inter-individual variation, as every individual carries >1000 (mostly) apparently benign CNVs within his or her genome [92]. CNVs are now considered a common form of structural genomic variation. The 'array era' has allowed for the identification of numerous recurrent and non-recurrent microdeletion syndromes (reviewed by van Bon [52]) and has proven its use in the diagnostics of ID, where it can mostly replace G-banded karyotyping. Array CGH has a detection rate of ~30% causative CNVs in individuals with ID where a genetic cause is suspected [63-65,93-96] (Figure 3).

1.4.4 Array CGH: gene discovery

Apart from molecularly and clinically characterizing microdeletion and -duplication syndromes, array CGH also allowed for defining and accurately mapping critical regions encompassing only several candidate genes, or even single genes, that could explain the major phenotypic features. This made candidate gene selection easier than before, and in this way array CGH contributed to gene discovery. An example where the critical region was significantly narrowed down is the chromosome 19q13.11 microdeletion syndrome. The smallest region of overlapping deletions in affected individuals with the core phenotypic features was refined from 2.9 Mb to 324 kb by mapping the endpoints of deletions in only seven individuals [88,97,98]. This reduced the number of candidate genes for the core phenotypic features from 35 to five [98]. In other microdeletion syndromes the major contributing gene was eventually identified by Sanger sequencing of candidate genes in the minimal overlapping region in patients with the typical syndromic features but without a causal CNV (for a summary see Table 2). One of the early examples where array CGH was applied in gene discovery is CHARGE syndrome, where deletions in two individuals showed an overlap of ~2.3 Mb. Sanger sequencing of all nine genes within this interval in individuals diagnosed with CHARGE syndrome led to identification of point mutations in CHD7 [99] in ten individuals. One of the latest examples of gene identification by this approach is the identification of KANSL1, an evolutionarily conserved regulator of chromatin modification, as the cause of the phenotypic features of 17q21.31 microdeletion syndrome [100,101]. Both Koolen et al. [100] and Zollino et al. [101] revealed protein-truncating point mutations in KANSL1, each in two patients. For chromosome 8q21.11 microdeletion syndrome the smallest region of overlap in eight patients in the original paper was narrowed down to 540 kb, encompassing only three genes [84]. Detection of a single gene deletion disrupting ZFHX4 in an individual with characteristics similar to those observed in chromosome 8q21.11 microdeletion syndrome suggests that haploinsufficiency of this gene is responsible for the recognizable phenotype (manuscript in preparation). However, in the majority of microdeletion and -duplication syndromes, no gene-specific mutations have been found so far and the phenotype is likely the result of a contiguous gene syndrome, where the dosage effect of several genes, maybe even working together in a synergistic way, gives rise to the typical clinical features.

Table 2 Microdeletion syndromes and genes responsible for their ID phenotype

Syndrome                Chromosomal location  Gene(s) involved  MIM phenotype number  Reference
Mowat-Wilson            2q22                  ZEB2              235730                Wakamatsu et al. 2001 [163]
2q23.1 microdeletion    2q23.1                MBD5              156200                Wagenstaller et al. 2007 [164]
2q37 microdeletion      2q37                  HDAC4             600430                Williams et al. 2010 [165]
5q14.3 microdeletion    5q14.3                MEF2C             613443                Le Meur et al. 2010 [166]
Sotos syndrome          5q35                  NSD1              117550                Imaizumi et al. 2002 [167]
8q21.11 microdeletion   8q21.11               ZFHX4             614230                Manuscript in preparation
Kleefstra               9q34.3                EHMT1             610253                Kleefstra et al. 2006 [54]
10q23 microdeletion     10q23                 PTEN              158350                Liaw et al. 1997 [168]
Angelman                15q11.2q13            UBE3A             105830                Kishino et al. 1997 [169]
Rubinstein-Taybi        16p13.3               CREBBP            180849                Petrij et al. 1995 [170]
Smith-Magenis           17p11.2               RAI1              182290                Slager et al. 2003 [171]
Koolen-de Vries         17q21.31              KANSL1            610443                Koolen et al. 2012 [55,100] / Zollino et al. 2012 [101]
Pitt-Hopkins            18q21.1               TCF4              610954                Zweier et al. 2007 [86] / Amiel et al. 2007 [85]
Phelan-McDermid         22q13.3               SHANK3            606232                Bonaglia et al. 2001 [55]
Pelizaeus-Merzbacher    Xq22                  PLP1              312080                Willard et al. 1985 [172]
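The narrowing of a critical region from overlapping deletions, as described in this section, comes down to intersecting intervals. The following is a minimal sketch of that idea and not the procedure used in the cited studies: the coordinates, gene names and function names are all hypothetical, and real analyses also have to deal with uncertainty in breakpoint positions and with phenotypic variability between patients.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) on one chromosome, arbitrary units


def smallest_region_of_overlap(deletions: List[Interval]) -> Optional[Interval]:
    """Return the interval shared by all deletions, or None if there is none."""
    start = max(s for s, _ in deletions)
    end = min(e for _, e in deletions)
    return (start, end) if start < end else None


def genes_in_region(region: Interval, genes: Dict[str, Interval]) -> List[str]:
    """List the genes that overlap the given region at least partially."""
    r_start, r_end = region
    return [name for name, (g_start, g_end) in genes.items()
            if g_start < r_end and g_end > r_start]


# Hypothetical deletions observed in three patients with the core phenotype.
patient_deletions = [(100, 900), (300, 1200), (250, 700)]

# Hypothetical gene annotation for the same chromosome.
gene_positions = {
    "GENE_1": (150, 280),
    "GENE_2": (320, 400),
    "GENE_3": (650, 800),
    "GENE_4": (950, 1000),
}

critical_region = smallest_region_of_overlap(patient_deletions)
print(critical_region)  # (300, 700)

candidates = genes_in_region(critical_region, gene_positions)
print(candidates)  # ['GENE_2', 'GENE_3']
```

Each additional patient with a deletion that only partly covers the current region shrinks the interval further, which is how a handful of individuals can reduce a candidate list from dozens of genes to a few.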

1.4.5 Gene discovery by linkage studies

In familial cases of ID, the identification of genetic defects is usually preceded by determination of the genomic position of the defect using linkage analysis or homozygosity mapping. Linkage analysis is a tool to localize the genetic defect based on analysis of highly polymorphic markers in the human genome that should segregate with the disease allele. It turned out to be a valuable technique to locate the genomic region harboring the genetic defect in families with an X-linked pedigree. Homozygosity mapping makes use of markers to identify homozygous genomic regions where the genetic defect is localized. In families with an autosomal recessive inheritance pattern, homozygosity mapping has focused on individuals from large consanguineous families, which often originate from geographical regions where consanguineous marriages are common [102]. Localization of the genomic region that is expected to harbor the disease allele by either of these strategies is usually followed by a candidate gene approach, meaning that genes in the linkage interval or homozygous region are ranked by their characteristics and probability of being involved in neurodevelopmental disease, followed by Sanger sequencing of the best candidate genes to reveal possible mutations.

For X-linked ID, linkage analysis turned out to be a fruitful strategy. This resulted in the identification of >80 X-linked ID genes, of which 30 give rise to non-syndromic ID and 51 to syndromic forms, although the line between syndromic and non-syndromic ID genes is blurred since several genes give a wide range of phenotypic features, both syndromic and non-syndromic (reviewed by Gécz et al. [12] and Ropers [103,104]). FMR1, FMR2, MECP2 and ARX are the most frequently mutated genes. The successful achievements on chromosome X were made possible largely by international collaborative efforts of the EURO-MRX [105] (www.euromrx.com) and IGOLD (http://goldstudy.cimr.cam.ac.uk/) consortia. Families collected by these consortia were studied to identify causative gene mutations using positional cloning and candidate gene approaches. More recently, large-scale sequencing efforts have been successfully applied to these cohorts. Nine X-linked ID genes were identified in a large-scale Sanger sequencing effort by Tarpey et al. [106], who resequenced all coding exons of the X-chromosome in 208 ID families with an X-linked inheritance mode. By doing so, they identified mutations in AP1S2, BRWD3, CASK, CUL4B, SLC9A6, SYP, UPF3B, ZDHHC9 and ZNF711, predominantly in families with a non-syndromic ID phenotype. Truncating mutations in nineteen genes were identified only in single families, representing novel candidate genes for ID. Inactivating mutations accounted for ID in 11% of the families and the total yield of apparently causative mutations in this study was 25%. Another large-scale study in X-linked ID was conducted by Kalscheuer et al. (manuscript submitted). In this study massive parallel sequencing of all coding exons of the X-chromosome was performed in probands of almost 250 X-linked families. This revealed mutations in another 14 novel X-chromosomal genes that are associated with ID and brings the total of X-chromosomal ID genes to ~110 today. The authors note that mutations in these currently known ~110 genes account for up to 70% of families with X-linked ID.

Compared to X-linked ID, gene identification in non-syndromic autosomal recessive ID has been lagging behind. Until 2006 homozygosity mapping had revealed only three disease loci and genes in non-syndromic recessive ID: PRSS12, CRBN and CC2D1A. Another eight loci were identified by Najmabadi et al. [107] in a systematic SNP-array based homozygosity analysis in 78 consanguineous Iranian families. None of the 78 families had homozygous loci overlapping with the three previously identified genes, suggesting that there is no single gene with a major contribution to autosomal recessive ID, but rather that this condition is genetically heterogeneous. In the search for the genes involved in autosomal recessive ID, studies initially focused on consanguineous families. One of the downsides of studying this population is that consanguineous families are rare in the Western world. Despite all efforts, before the introduction of massive parallel sequencing (see 1.4.6), only nine genes for non-syndromic autosomal recessive ID had been identified (reviewed by Ropers [13,108]).
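The principle of homozygosity mapping described in this section can be illustrated with a small sketch that scans ordered SNP genotypes for stretches in which all affected relatives are homozygous for the same allele. This is a simplified illustration and not the method of any of the cited studies: the genotype coding ("AA", "AB", "BB"), the toy data, the function name and the minimum run length are assumptions, and real analyses work with physical or genetic distances and tolerate genotyping errors.

```python
from typing import Dict, List, Tuple


def shared_homozygous_runs(genotypes: Dict[str, List[str]],
                           min_markers: int = 3) -> List[Tuple[int, int]]:
    """Find runs of consecutive markers that are identically homozygous.

    Returns (first_index, last_index) for every run of at least
    `min_markers` markers at which all individuals carry the same
    homozygous genotype ("AA" or "BB").
    """
    individuals = list(genotypes.values())
    n_markers = len(individuals[0])
    runs: List[Tuple[int, int]] = []
    run_start = None
    for i in range(n_markers):
        calls = {individual[i] for individual in individuals}
        shared = len(calls) == 1 and next(iter(calls)) in ("AA", "BB")
        if shared and run_start is None:
            run_start = i                      # a new run begins here
        elif not shared and run_start is not None:
            if i - run_start >= min_markers:   # keep runs that are long enough
                runs.append((run_start, i - 1))
            run_start = None
    if run_start is not None and n_markers - run_start >= min_markers:
        runs.append((run_start, n_markers - 1))
    return runs


# Hypothetical genotypes of two affected siblings at eight ordered markers.
affected_siblings = {
    "sib1": ["AB", "AA", "AA", "BB", "AA", "AB", "BB", "BB"],
    "sib2": ["AA", "AA", "AA", "BB", "AA", "BB", "BB", "AB"],
}

runs = shared_homozygous_runs(affected_siblings, min_markers=3)
print(runs)  # [(1, 4)]
```

Markers 1 to 4 form the only shared homozygous stretch of sufficient length in this toy example; in practice such a region would then be searched for candidate genes.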

1.4.6 Massive parallel sequencing

1.4.6.1 Massive parallel sequencing: a revolution for human genetics

Massive parallel sequencing (MPS) was introduced around 2005, and whereas array CGH had already greatly impacted genetic research by its high resolution on a genome-wide scale, MPS truly revolutionized human genetics and the study of Mendelian disorders, including ID [109-115]. MPS, also known as next generation sequencing or second generation sequencing, encompasses several techniques for high-throughput DNA sequencing in which millions of short DNA fragments (reads of 50-500 bases) are sequenced in parallel (Figure 4). Several MPS approaches have been used to discover mutations responsible for ID phenotypes: X-exome sequencing in X-linked ID, intersection filtering of exome variants in syndromic ID, de novo analysis of exome variants in sporadic non-syndromic ID and, for autosomal recessive ID, targeted sequencing of homozygous loci and recessive analysis of exome variants (Figure 5).
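Two of the filtering strategies named above, de novo analysis and intersection filtering, can be expressed as simple set operations. The sketch below only illustrates the logic under strong simplifying assumptions: the variant identifiers, gene names and function names are hypothetical, every variant call is taken to be correct, and the frequency and quality filters that real pipelines apply beforehand are left out.

```python
from typing import List, Set


def de_novo_variants(child: Set[str], mother: Set[str],
                     father: Set[str]) -> Set[str]:
    """Variants seen in the child but in neither parent."""
    return child - (mother | father)


def genes_shared_by_all(patients: List[Set[str]]) -> Set[str]:
    """Genes carrying a rare variant in every patient (intersection filtering)."""
    return set.intersection(*patients)


# Hypothetical rare variants called in a patient-parent trio.
child_calls = {"chr1:1000A>G", "chr2:500C>T", "chr7:42G>A"}
mother_calls = {"chr1:1000A>G"}
father_calls = {"chr2:500C>T"}

candidates_de_novo = de_novo_variants(child_calls, mother_calls, father_calls)
print(candidates_de_novo)  # {'chr7:42G>A'}

# Hypothetical genes with rare variants in three unrelated patients
# who share the same syndrome.
genes_per_patient = [
    {"GENE_X", "GENE_Y"},
    {"GENE_X", "GENE_Z"},
    {"GENE_X"},
]

shared_genes = genes_shared_by_all(genes_per_patient)
print(shared_genes)  # {'GENE_X'}
```

The de novo filter needs both parents to be sequenced, whereas intersection filtering relies on several unrelated patients with the same recognizable phenotype.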

1.4.6.2 The coding sequences of the genome: the exome

Exome sequencing is now widely used as an unbiased genome-wide approach to search for disease-causing rare variants. Exome sequencing relies on capturing, sequencing and analyzing the ~1% of the genome that contains protein coding

Figure 4 Schematic representation of massive parallel sequencing principles based on library preparation using array-based and in-solution capture of target DNA

[Figure: workflow from genomic DNA through fragmentation, end repair and adapter ligation; array-based or in-solution capturing; multiplexing; enriched library; massive parallel sequencing; read mapping, alignment and variant calling; to variant prioritization and biological interpretation.]

First, genomic DNA is isolated and sheared into fragments, and sequencing adapters for multiplexing are attached. Selection of target sequences is achieved by hybridization against target-specific oligonucleotide microarrays or in-solution capture probes; the latter contain magnetic beads and are retained by a magnetic field. Washing removes unbound fragments, the generated library is multiplexed by PCR reaction, and fragments are sequenced simultaneously (massive 'parallel' sequencing). Alignment of the sequence reads to the human reference genome and variant calling allow for variant prioritization and interpretation. Adapted from Haas et al. 2011 [161].


Figure 5 Massive parallel sequencing (MPS) approaches to identify mutations causing ID

Panel labels: Targeted MPS (homozygous region); Genome wide MPS (family-based exome: de novo variants; intersection filtering: shared variants; two alleles in the same gene: recessive variants).

(A) Homozygosity mapping followed by targeted MPS of the homozygous linkage intervals, as described by Najmabadi et al.110. (B) X-chromosome exome sequencing: sequencing of all coding exons of the X-chromosome in families with an apparent X-linked inheritance pattern, followed by segregation testing. This approach was taken by Kalscheuer et al.160. (C) Family-based exome sequencing to identify de novo rare variants in sporadic cases with non-syndromic ID/autism, applied on a large scale by several studies115, 131-134. (D) Exome sequencing in syndromic ID to identify rare variants in the same gene, in genes with a similar function or in genes functioning in the same protein complex. This approach led to the identification of many dominant syndromic ID genes109, 111, 113, ™, 123, 162. (E) Exome sequencing to identify rare potentially recessive variants (homozygous and compound heterozygous) in siblings with (non-)syndromic ID. Applied in a systematic screen of twenty sibling families by Schuurs-Hoeijmakers et al.157, 158.


information. Its first application, on 12 HapMap individuals (2009)116, described the immense variation between healthy individuals and was quickly followed by the first report of gene identification by exome sequencing in a Mendelian disorder (January 2010)117, in which mutations in DHODH were identified as the cause of Miller syndrome, also known as postaxial acrofacial dysostosis. In the past two years exome sequencing has identified genes for > 70 Mendelian conditions, of which > 40 have ID as a phenotypic feature, and it continues to contribute to disease gene discovery at a still accelerating pace118 (Table 3, Figure 6).

1.4.6.3 Gene discovery in ID syndromes by use of exome sequencing

Exome sequencing in ID research was first applied in sporadic, syndromic conditions involving ID. The general exome strategy for disease gene identification in such sporadic syndromes is to scan for genes harboring potentially disruptive mutations in all or most of the patients investigated by exome sequencing, a method called 'intersection filtering'119 (Table 3). The first studies that successfully applied this strategy identified SETBP1 mutations, clustered in a highly conserved 11 bp region of the gene, as the cause of Schinzel-Giedion syndrome109, mutations in ASXL1 as the cause of Bohring-Opitz syndrome120 and MLL2 as the predominant gene causing Kabuki syndrome111. Activating mutations in ABCC9, previously associated with idiopathic dilated cardiomyopathy type 10121, were identified in twenty-six individuals with Cantú syndrome114, 122. Mutations were also discovered in genes that function in the same pathway or even the same protein complex and that give a similar clinical presentation when disrupted. In Baraitser-Winter syndrome, a variation of the above mentioned approach was taken: exome sequencing was performed on three proband-parent trios with a typical Baraitser-Winter syndrome presentation, aiming for de novo mutation detection in the same gene. De novo missense mutations in two genes, ACTB and ACTG1, encoding the cytoplasmic β- and γ-actin proteins, explained the clinical presentation in the three probands. Sanger sequencing in 15 more affected individuals identified mutations in either of the two genes in all 15 individuals112. In Kleefstra syndrome a combination of targeted MPS and exome sequencing in individuals with the typical syndromic presentation but without mutations in the canonical Kleefstra syndrome gene (EHMT1) revealed de novo mutations in four different genes: SMARCB1, MBD5, NR1I3 and MLL3, which like EHMT1 have a role in epigenetic regulation of transcription162.
De novo mutations in several of these genes, including EHMT1 and MLL3, were also identified by others in patients with autism spectrum disorder, schizophrenia and other neurodevelopmental disorders124-126. Exome sequencing of five individuals with typical Coffin-Siris syndrome also revealed de novo SMARCB1 mutations in two individuals. SMARCB1 encodes a subunit of the SWItch/Sucrose NonFermenting (SWI/SNF) complex. The authors screened 15 genes encoding subunits of this complex in 23


individuals with Coffin-Siris syndrome. Germline mutations were detected in 20 of 23 individuals in six SWI/SNF subunit genes: SMARCB1, SMARCA4, SMARCA2, SMARCE1, ARID1A and ARID1B113, 127. As demonstrated in the above examples, exome sequencing by the intersection filtering approach has proven its value for well known recognizable ID syndromes as well as for other neurodevelopmental and psychiatric conditions. These studies have also revealed that the molecular underpinnings of different neurodevelopmental and psychiatric conditions overlap strongly.
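The intersection filtering idea described in this section can be illustrated with a minimal sketch. It assumes that each patient's exome has already been reduced to a set of genes carrying a rare, potentially disruptive variant; the function name, the `min_fraction` parameter, the patient identifiers and the gene names are hypothetical placeholders and are not taken from the cited studies.

```python
from collections import Counter


def intersection_filter(genes_by_patient, min_fraction=1.0):
    """Return genes hit by a candidate variant in at least min_fraction of patients.

    genes_by_patient maps a patient ID to the genes in which that patient
    carries a rare, potentially disruptive variant (assumed to be pre-filtered).
    """
    n_patients = len(genes_by_patient)
    counts = Counter()
    for genes in genes_by_patient.values():
        counts.update(set(genes))  # count each gene once per patient
    return sorted(g for g, n in counts.items() if n >= min_fraction * n_patients)


# Hypothetical toy cohort: placeholder patient IDs and gene names.
cohort = {
    "patient_1": {"GENE_A", "GENE_B", "GENE_C"},
    "patient_2": {"GENE_A", "GENE_C"},
    "patient_3": {"GENE_A", "GENE_D"},
    "patient_4": {"GENE_A", "GENE_C"},
}

print(intersection_filter(cohort))        # genes hit in all four patients
print(intersection_filter(cohort, 0.75))  # genes hit in at least three of four
```

Lowering `min_fraction` below 1.0 corresponds to requiring a hit in "most" rather than "all" patients, which tolerates genetic heterogeneity within a clinically defined syndrome at the cost of more candidate genes.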

1.4.6.4 Massive parallel sequencing for autosomal gene identification in non-syndromic ID

Most individuals with ID do not reproduce and therefore present as sporadic cases, frequently without a clearly recognizable clinical phenotype. Discovering the underlying genetic cause for this non-syndromic group had been extremely challenging, as it is almost impossible to make an a priori selection of patients with the same underlying genetic etiology. Genome wide CNV analysis by array CGH was therefore the best applicable test, searching mostly for de novo CNVs, but it left the majority of patients without a molecular diagnosis. As de novo CNVs account for ~15% of ID and the human per-generation mutation rate is relatively high128-130, Vissers et al.115 applied exome sequencing to test the hypothesis that de novo basepair mutations contribute to ID pathogenesis in sporadic cases. Vissers et al. sequenced the exomes of ten case-parent trios with isolated severe non-syndromic ID, aiming for mutations in exons or splice-sites present in the child but in neither of the parents. De novo causative mutations were identified in SYNGAP1 and RAB39B, both previously implicated in ID and the latter located on the X-chromosome. In four other genes (DYNC1H1, YY1, DEAF1 and CIC) de novo mutations were identified that were predicted in silico to be disruptive to gene function. All these genes fulfill essential functions in nervous system development and were considered good candidates to explain the clinical phenotype. In one of the male probands, a maternally inherited X-chromosomal mutation in JARID1C was identified, underscoring the important contribution of X-chromosomal genes to ID in (isolated) male individuals. These results underlined the high heterogeneity in ID and explain how a disorder with reduced reproductive fecundity can be a common disorder.
Studies in larger cohorts of individuals (a hundred or more) with isolated ID and autism, a disorder that clinically and genetically shows overlap with ID, followed and showed a similar pattern of results, with highly (potentially) disruptive de novo mutations overrepresented in case-parent trios as compared to control-parent trios123, 126, 131-134.
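The core of the trio-based de novo analysis described above is a set difference: variants called in the child that are called in neither parent. The following is a minimal sketch under the assumption that variants are represented as (chromosome, position, reference allele, alternative allele) tuples; the function name and the example coordinates are hypothetical, and a real analysis would additionally require sufficient sequence coverage in both parents and independent validation of each candidate.

```python
def de_novo_candidates(child, mother, father):
    """Return variants present in the child and absent from both parents.

    Each variant is a (chrom, pos, ref, alt) tuple; inputs are iterables of
    such tuples. Absence in a parent is taken at face value here, which is a
    simplification: low parental coverage can mimic a de novo event.
    """
    return sorted(set(child) - set(mother) - set(father))


# Hypothetical toy calls (coordinates are placeholders, not real variants).
child = [("1", 1000, "A", "G"), ("6", 2500, "C", "T"), ("X", 300, "G", "A")]
mother = [("1", 1000, "A", "G"), ("X", 300, "G", "A")]
father = [("1", 1000, "A", "G")]

print(de_novo_candidates(child, mother, father))
```

In this toy example the X-chromosomal variant is excluded because it is maternally inherited, which mirrors why inherited X-linked variants in male probands have to be evaluated separately from the de novo filter.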

Table 3 ID disease gene identification by exome sequencing (until July 2012)

Disorder | MIM phenotype number | Inher. | Location | Gene | Number of exomes
2010
Hyperphosphatasia with ID | 239300 | AR | 1p36.11 | PIGV | 3
3MC syndrome | 257920 | AR | 3q27.3 | MASP1 | 2
Kabuki syndrome | 147920 | AD | 12q13.12 | MLL2 | 10
Schinzel-Giedion syndrome | 269150 | AD | 18q21.1 | SETBP1 | 4
Primary AR microcephaly with or without cortical malformations | 604317 | AR | 19q13.12 | WDR62 | 3
2011
with psychomotor retardation | 614229 | AR | 1q32.2 | SYT14 | 2
Ichthyosis, spastic quadriplegia and ID | 614457 | AR | 6q14 | ELOVL4 | 1
Immunodeficiency, centromeric instability and facial anomalies | 614069 | AR | 6q21 | ZBTB24 | 1
Complex bilateral occipital cortical gyration abnormalities | 614115 | AR | 9q34.12 | LAMC3 | 1
Adenosine kinase deficiency | 614300 | AR | 10q22.2 | ADK | 2
Ohdo syndrome | 603736 | AD | 10q22.2 | KAT6B | 4
Hypomyelinating leukoencephalopathy | 607694 | AR | 10q22.3 | POLR3A | 3
Hypomyelinating leukoencephalopathy | 614381 | AR | 12q23.3 | POLR3B
Autosomal recessive spastic paraplegia 52, with ID | 614067 | AR | 14q12 | AP4S1 | 1
Seckel syndrome | 613823 | AR | 15q21.1 | CEP152 | 1
KBG syndrome | 148050 | AD | 16q14.3 | ANKRD11 | 3
Combined malonic and methylmalonic aciduria | 614265 | AR | 16q24.3 | ACSF3 | 1
Myhre syndrome | 139210 | AD | 18q21.1 | SMAD4 | 2
Nonsyndromic autosomal recessive intellectual disability | 614020 | AR | 19p13.12 | TECR | 6
Adams-Oliver syndrome | 614194 | AR | 19p13.2 | DOCK6 | 4
3M syndrome | 614205 | AR | 19q13.32 | CCDC8 | 3
Bohring-Opitz syndrome | 605039 | AD | 20q11 | ASXL1 | 3
Congenital disorder of glycosylation | 614507 | AR | 1p36.12 | DDOST
Autosomal recessive primary microcephaly | 614673 | AR | 4q12 | CEP135
Weaver syndrome | 614421 | AD | 7q36.1 | EZH2
Autosomal recessive spastic paraplegia 54, with ID | - | AR | 8p11.23 | DDHD2
Hyperphosphatasia with ID | 239300 | AR | 9p13.3 | PIGO
Genitopatellar syndrome | 606170 | AD | 10q22.2 | KAT6B
Microcephaly with/without chorioretinopathy, lymphedema, ID | 152950 | AD | 10q23.33 | KIF11
Autosomal dominant syndromic ID | - | AD | 11q13.1-q13.2 | PACS1
Wiedemann-Steiner syndrome | 605130 | AD | 11q23.3 | MLL
Cantú syndrome | 239850 | AD | 12p12.1 | ABCC9
Autosomal dominant ID
Acrodysostosis with ID | 614613 | AD | 15q11.2-q12.1 | PDE4D
Floating-Harbor syndrome | 136140 | AD | 16p11.2 | SRCAP
CHIME syndrome | 280000 | AR | 17p11.2 | PIGL
Congenital hypothyroidism | 614450 | AD | 17q11.2 | THRA
Mandibulifacial dysostosis with microcephaly | 610536 | AD | 17q21.31 | EFTUD2
Coffin-Siris syndrome (MRD12, MRD14, MRD15, MRD16, Nicolaides-Baraitser syndrome) | 614607 | AD | 1p36.11 | ARID1A
 | 614562 | AD | 6q25.3 | ARID1B
 | 601358 | AD | 9p24.3 | SMARCA
 | 614609 | AD | 19p13.2 | SMARCA
 | 614608 | AD | 22q11.23 | SMARCB

Inher.= mode of inheritance. AD= autosomal dominant. AR= autosomal recessive. ID= intellectual disability; exomes in the original study used to identify the ID gene.

Figure 6 Total number of genes involved in ID phenotypes and identified from 1980-2012

*For 2012, this is an expected total number based on the first six months of 2012. Colors represent the different inheritance modes: gray for autosomal recessive ID genes (AR), turquoise for autosomal dominant genes (AD) and purple for X-chromosomal genes (X-linked).

1.4.6.5 Massive parallel sequencing for recessive gene identification

Exome sequencing has also proven effective in consanguineous families with multiple affected individuals. Caliskan et al.135 were the first to report gene identification in autosomal recessive ID by exome sequencing. They studied a consanguineous family with 15 siblings, of whom five were diagnosed with non-syndromic ID, and identified a homozygous mutation leading to an amino acid substitution in TECR, encoding a synaptic glycoprotein, in a homozygous locus on chromosome 19p13. Several other reports of gene identification by exome sequencing in autosomal recessive ID have been published since, mostly in single consanguineous families with a previously determined genomic disease locus136-139 or in multiple (non-consanguineous) families with a recognizable recessive


ID syndrome140-142. The first study that systematically analyzed a large group of random families with syndromic and non-syndromic autosomal recessive ID appeared in 2011110. The authors centered their approach around conventional homozygosity mapping in 136 large consanguineous families, mostly originating from Iran. All genes in homozygous regions that had previously been detected by SNP array homozygosity mapping were interrogated by massive parallel sequencing. Pathogenic mutations were identified in 23 genes (in 26 families) that had previously been implicated in ID or related neurological disorders, and probable disease causing variants were found in 50 novel candidate genes for ID in 52 families. Proteins encoded by several of these genes directly interact with proteins encoded by ID genes. Many others are involved in transcription and translation, cell cycle control and energy metabolism, all crucial cellular processes that seem to be vital for normal central nervous system functioning and development. Only two of the 50 novel candidate genes, ZNF526 and ELP2, carried mutations in more than one family, confirming the heterogeneous origin of ID.

1.4.6.6 Exome sequencing in diagnostics

The strength of MPS in disease gene discovery for different inheritance models in ID has been shown over recent years (reviewed by Topper et al., Ku et al., Bamshad et al. and Veltman & Brunner143-146). Exome sequencing in particular, with its unbiased genome wide approach, has proven its success, not only for ID but for essentially all Mendelian disorders147, 148. It is not surprising that exome sequencing, which is more affordable than genome sequencing, is being implemented in a diagnostic setting, replacing Sanger sequencing, in medical centers throughout the world. Exome sequencing is expected to improve diagnostic yield and will especially prove its use in diagnostic testing of genetically and clinically heterogeneous disorders149. For such disorders, like Noonan syndrome, Leber congenital amaurosis, retinitis pigmentosa and of course ID, one by one Sanger sequencing of specific genes associated with the disease was until now the only method to obtain a genetic diagnosis. Exome sequencing offers the possibility to investigate all known disease genes for a given disorder at once. Furthermore, if no mutations are detected after initial screening of known genes, it offers the possibility to look beyond these for potential pathogenic mutations in the remaining coding genome. The downside of this approach, if regarded as such, may be the possibility of unexpected findings of medical relevance unrelated to the original medical question of the patient. Patients should therefore be counseled appropriately about the chance of such a finding. Exome sequencing in a diagnostic setting, eventually to be surpassed by whole genome sequencing, brings diagnostic testing and research closer together than ever before and promises an exciting time for understanding the molecular basis of human diseases.


1.5 Scope and outline of this thesis

1.5.1 Why do we study genetics of ID?

This thesis focuses on individuals with ID with a suspected underlying genetic origin. Understanding the genetics underlying ID is of great value for the patient and his/her family as well as for his/her physicians. In the first place, receiving a diagnosis ends an often long, intense and uncertain diagnostic process for the patient and his/her family in search of the cause of the disorder, as it answers the question why the disease occurred. This information allows the involved physicians to optimize patient management and subsequently determine prognosis and personalized treatment options. It also allows for counseling in regard to family planning. Importantly, identification of the underlying genetic defect is often the starting point for understanding the pathobiology behind the observed phenotype and the development of the human brain in general, which is mandatory to initiate development of therapeutic interventions. Lastly, studying ID is relevant because it is a leading medical and socio-economic problem due to its high frequency (1-3%) and the lifelong care that needs to be provided.

1.5.2 Aim

The main aim of this thesis is to identify the underlying genetic defect in families with ID of unknown etiology. Apart from the medical relevance of establishing a molecular diagnosis, gene identification in ID provides a basis for further investigations aimed at understanding the biological consequences of these genetic defects and how they relate to the observed phenotype.

1.5.3 Research approach

Autosomal recessive forms of ID might contribute up to 25%150-153 of genetic causes of ID. Nevertheless, genetic research in this area has been relatively unsuccessful, especially in non-syndromic ID, as compared to research in X-linked ID. We aimed to identify novel autosomal recessive ID genes by approaches parallel to the existing studies in consanguineous families. The focus in the first three chapters (Chapters 2 to 4) lies on a specific patient population, namely non-consanguineous families with siblings affected with ID. Families with multiple affected siblings comprise 6% of more than 4,000 families in our in-clinic phenotype database (for characterization of individuals with ID in the Nijmegen phenotype database, see Table 4). These Western families often consist of only two to three affected individuals and are mostly of non-consanguineous origin. In Chapter 2 we describe an alternative use of homozygosity mapping in ten non-consanguineous affected brother-sister pairs to identify homozygous regions that might harbor mutations. Homozygosity mapping in such small, non-consanguineous families is performed


Table 4 Number of ID families collected in the Nijmegen phenotype database

Group | Mild | Moderate | Severe | Unknown | Total
Isolated cases (total) | 845 (22%) | 611 (16%) | 753 (20%) | 1583 (42%) | 3792 (100%)
Isolated cases of non-consanguineous origin | 828 (22%) | 599 (16%) | 730 (20%) | 1557 (42%) | 3714 (100%)
Isolated cases of consanguineous origin | 17 (22%) | 12 (15%) | 23 (29%) | 26 (33%) | 78 (100%)
Male isolated cases | 499 (22%) | 342 (15%) | 423 (19%) | 974 (44%) | 2238 (100%)
Sibling families (total)a | 73 (29%) | 50 (19%) | 78 (31%) | 54 (21%) | 255 (100%)
Sibling families of non-consanguineous origin | 66 (29%) | 48 (21%) | 65 (28%) | 52 (22%) | 231 (100%)
Sibling families of consanguineous origin | 7 (29%) | 2 (9%) | 13 (54%) | 2 (8%) | 24 (100%)
Brother-pair families | 19 (40%) | 6 (13%) | 16 (34%) | 6 (13%) | 47 (100%)

Number of ID families collected in the Nijmegen phenotype database and seen in the genetics clinic of the Radboud University Medical Centre. The respective distribution of ID level is shown. Sibling families constitute 6% of ID families (255 of 4047) in the Nijmegen phenotype database. Isolated males are over-represented, with a female:male ratio of 1:1.4. Similarly, brother-pair families are over-represented as compared to sister-sister and brother-sister families. In isolated cases, the ID level is not known in 42% of individuals, whilst in the siblings the IQ level is not known in 22%; nevertheless the distribution of mild, moderate and severe ID is very comparable between the isolated cases and sibling families. a Of 103 sibling families the gender of both affected siblings is known, of which 47 are brother pairs.

under the assumption that a homozygous mutation in a recessive disease gene is passed on to the affected child by both parents, who received the mutant allele from a common ancestor. This approach had already been used for other disorders such as retinitis pigmentosa, steroid resistant nephrotic syndrome and nephronophthisis154-156. Homozygous regions in non-consanguineous families are smaller in size and fewer in number than in consanguineous families, and this approach was expected to make candidate gene selection and gene identification feasible. On average 21 homozygous regions over 1 Mb in size were identified per family, and overlapping homozygosity was found with four previously reported loci for non-syndromic autosomal recessive ID, MRT8 to MRT11, indicating that this approach might contribute to the search for autosomal recessive ID genes. As technology has evolved rapidly, in Chapter 3 we applied an unbiased genome wide exome sequencing approach to systematically study 20 small sibling families, including four families of Chapter 2, mostly of non-consanguineous origin157. The advantage of exome sequencing over the homozygosity mapping


approach is that families could be investigated for both homozygous and compound heterozygous recessive mutations. Brother-pair families were in addition analyzed for hemizygous mutations on the X-chromosome. In Chapter 3 we propose a mutation classification strategy for interpretation of pathogenicity based on mutation and gene characteristics. Pathogenic recessive mutations were discovered in three genes, DDHD2, SLC6A8 and SLC9A6, of which the latter two have previously been implicated in X-linked ID. Potentially pathogenic mutations were identified in five autosomal genes, MCM3AP, PTPRT, SYNE1, TDP2 and ZNF528, and one X-chromosomal gene, BCORL1. In Chapter 4, we provide further evidence of pathogenicity for one of the novel candidate ID genes that was identified in Chapter 3, DDHD2, encoding an intracellular phospholipase A1. Mutations in DDHD2 were identified in three additional families, confirming that mutations in this gene give rise to a novel syndromic phenotype consisting of intellectual disability, spastic paraplegia and a pattern of brain abnormalities consisting of a thin corpus callosum accompanied by subtle periventricular white matter hyperintensities and an abnormal lipid peak as measured by proton MR spectroscopy158 (Chapter 4). We studied the cellular morphology in EBV transformed leucocytes of affected individuals. Fruitfly larvae with knockdown of Ddhd2, the fruitfly orthologue of human DDHD2, were studied to gain insight into DDHD2 gene function in the central nervous system. This showed a reduced number of active zones at the synaptic terminals in the larvae with Ddhd2 knockdown, supporting a role for human DDHD2 in synaptic organization and transmission. The proton MR spectroscopy abnormalities that were observed in three out of four families can be used as a clinical measure to differentiate the DDHD2-related ID-spasticity phenotype from other such phenotypes.
Chapters 2 to 4 show that small sibling families seem perfectly suitable to identify recessive (autosomal and X-chromosomal) pathogenic alleles and are a valuable subpopulation for ID gene identification. Exome sequencing in well-characterized syndromic forms of ID has proven to be successful. Chapter 5 describes the application of family-based exome sequencing for de novo mutation detection in a hitherto unknown syndromic ID phenotype. Identical de novo mutations in PACS1, encoding a trans-Golgi membrane traffic regulator, were identified in two boys from unrelated families who presented with strikingly similar facial dysmorphisms159. Overexpression of the mutant protein in zebrafish embryos induced a craniofacial phenotype consistent with the human pathology, thus supporting causality of the mutation. Further studies showed that the zebrafish craniofacial phenotype was driven by aberrant specification and migration of cranial neural crest cells, suggesting that perturbation of neural crest migration contributed to the observed craniofacial phenotype in the boys159.
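The recessive analysis of sibling exomes outlined above for Chapter 3 (homozygous and compound heterozygous variants shared by affected siblings) can be sketched as follows. This is an illustrative simplification, not the pipeline used in the thesis: the function name, the input layout and the gene and variant identifiers are hypothetical, and two shared heterozygous variants in one gene are only a candidate for compound heterozygosity until segregation in the parents shows that they lie on different alleles.

```python
from collections import defaultdict


def recessive_candidates(calls_by_sib):
    """Return (homozygous_genes, compound_het_genes) shared by all siblings.

    calls_by_sib maps a sibling ID to (gene, variant_id, zygosity) tuples,
    with zygosity either "hom" or "het". Only calls that are identical in
    every sibling are considered.
    """
    shared = None
    for calls in calls_by_sib.values():
        calls = set(calls)
        shared = calls if shared is None else shared & calls

    hom_genes = set()
    het_variants = defaultdict(set)
    for gene, variant_id, zygosity in shared or ():
        if zygosity == "hom":
            hom_genes.add(gene)
        else:
            het_variants[gene].add(variant_id)

    # At least two distinct shared heterozygous variants in the same gene.
    compound_het = {g for g, v in het_variants.items() if len(v) >= 2}
    return sorted(hom_genes), sorted(compound_het)


# Hypothetical toy family with placeholder gene and variant names.
family = {
    "sib_1": [("GENE_A", "v1", "hom"), ("GENE_B", "v2", "het"),
              ("GENE_B", "v3", "het"), ("GENE_C", "v4", "het")],
    "sib_2": [("GENE_A", "v1", "hom"), ("GENE_B", "v2", "het"),
              ("GENE_B", "v3", "het"), ("GENE_D", "v5", "het")],
}

print(recessive_candidates(family))
```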


Chapter 6 provides a discussion of the results described in the preceding chapters of this thesis, their implications and future perspectives and challenges in ID research and diagnostics.


Homozygosity mapping in outbred families with Mental Retardation

The European Journal of Human Genetics, 2011, volume 19 (5): p. 597-601

Janneke H.M. Schuurs-Hoeijmakers, Jayne Y. Hehir-Kwa, Rolph Pfundt, Bregje W.M. van Bon, Nicole de Leeuw, Tjitske Kleefstra, Michèl A. Willemsen, Ad Geurts van Kessel, Han G. Brunner, Joris A. Veltman, Hans van Bokhoven, Arjan P.M. de Brouwer, Bert B.A. de Vries

Department of Human Genetics, Radboud University Nijmegen Medical Centre, Nijmegen, the Netherlands CHAPTER 2

Abstract

Autosomal Recessive Mental Retardation (AR-MR) may account for up to 25% of genetic MR. So far, mapping of AR-MR genes in consanguineous families has resulted in six non-syndromic genes, whereas more than 2,000 genes might contribute to AR-MR. We propose to use outbred families with multiple affected siblings for AR-MR gene identification. Homozygosity mapping in ten outbred families with affected brother-sister pairs using a 250K SNP array revealed on average 57 homozygous regions over 1 Mb in size per affected individual (range 20-74). Of these, 21 homozygous regions were shared between siblings on average (range 8-36). None of the shared regions of homozygosity (SROHs) overlapped with the non-syndromic genes. Thirteen SROHs overlapped with previously reported loci for AR-MR, namely MRT8, MRT9, MRT10 and MRT11. Among these was the longest observed SROH of 11.0 Mb in family ARMR1 on chromosome 19q13, which had 2.9 Mb (98 genes) in common with the 5.4 Mb MRT11 locus (195 genes). These data support the view that homozygosity mapping in outbred families may contribute to the identification of novel AR-MR genes.


Introduction

Mental Retardation (MR), also referred to as Intellectual Disability (ID), is a common neurodevelopmental disorder affecting approximately 1-3% of the general population1. Clinically, DSM-IV defines MR as significantly sub-average intellectual functioning (Intelligence Quotient (IQ) below 70) with an onset before the age of 18 years and impairment in adaptive functioning such as self care, social and interpersonal skills and work [Am. Psych. Ass. 1994. DSM-IV.]. MR can be subdivided into syndromic and non-syndromic forms based upon the presence or absence of additional features, although this distinction is clinically not always obvious. The etiology of MR is heterogeneous and, despite recent improvement of cytogenetic and molecular technologies, less than 50% of patients have an etiological diagnosis in clinical practice, hampering medical care and prognosis of the patient and genetic counseling of the families17, 201. Genetic causes contribute significantly to MR, and among these autosomal recessive inheritance may account for a substantial part of this disorder. Although no recent estimates have been made, autosomal recessive genes have previously been estimated to account for up to 25% of unexplained MR150-153. This is more than twice the contribution of single X-chromosomal genes to MR, which is estimated to explain MR in 10% of affected males103, 105, 202. For practical reasons, over the last decades the focus in genetic MR research has been on X-linked MR, leading to the identification of 90 disease genes on the X-chromosome, of which 38 (42%) lead to non-syndromic MR12. In contrast, of the 348 genes contributing to AR-MR phenotypes, only six genes (1.6%) have been identified as giving rise to non-syndromic AR-MR (OMIM: http://www.ncbi.nlm.nih.gov/omim).
Based on the number of MR genes on the X chromosome (11% of X-chromosome protein coding genes are implicated in X-linked MR) we estimate that there are about 2,000 AR-MR genes (11% of all 18625 autosomal protein coding genes) ([Vega v37; Sept 2009]; http://vega.sanger.ac.uk/Homo_sapiens/Info/Index). The limited number of non-syndromic AR-MR genes have mostly been elucidated through studies of consanguineous families large enough to perform linkage analysis resulting in significant LOD-scores. As these families are rare and most often originate from geographical regions where consanguineous marriages are common, other, parallel approaches for the identification of genes causing this heterogeneous condition are required. Recent technological advances, such as whole genome homozygosity mapping with high resolution single nucleotide polymorphism arrays combined with next generation sequencing, enable the analysis of outbred families (brother-sister or sister-sister pairs), which are more common than the consanguineous families used thus far. Homozygosity mapping in such small outbred families is performed under the assumption that a


homozygous mutation in a recessive disease gene is passed on to the affected child by both parents, who received the mutant allele from a common ancestor. Previous studies in other, less heterogeneous autosomal recessive disorders, such as retinitis pigmentosa, steroid resistant nephrotic syndrome and nephronophthisis, have shown the applicability of homozygosity mapping for identification of mutations in families and isolated patients without consanguinity154-156. The percentage of homozygous mutations in these cohorts might be as high as 70%156. Here, we report the first study of homozygosity mapping in outbred MR families.
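The estimate of about 2,000 AR-MR genes given earlier in this Introduction rests on a single extrapolation: the fraction of X-chromosomal protein coding genes implicated in MR is assumed to hold for the autosomes as well. As a worked check, using only the two figures quoted in the text (the variable names are illustrative):

```python
# Figures quoted in the Introduction.
x_linked_mr_percent = 11          # % of X-chromosomal protein coding genes implicated in X-linked MR
autosomal_coding_genes = 18_625   # autosomal protein coding genes (Vega v37)

# Assumption made in the text: the same percentage applies to autosomal genes.
estimated_ar_mr_genes = x_linked_mr_percent * autosomal_coding_genes / 100

print(round(estimated_ar_mr_genes))  # consistent with "about 2,000"
```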

Patients, Materials & Methods

Families

Ten families with brother-sister pairs with MR were included in this study (in total 22 individuals). Eight families consisted of two affected individuals. In two families, three individuals were analyzed: in family ARMR1 a brother-sister pair with an unaffected brother and in family ARMR10 monozygotic twin brothers and their sister. Patients were clinically evaluated by a clinical geneticist in the human genetics department of the Radboud University Nijmegen Medical Centre in Nijmegen, the Netherlands (Table 1). The study was approved by the Medical Ethics Committee of the Radboud University Nijmegen Medical Centre.

Homozygosity mapping

Patient DNA was isolated from lymphocytes as described by Miller et al.203. Samples were hybridized on an Affymetrix NspI 250K SNP array. SNP array experiments were performed according to the manufacturer's protocols (Affymetrix, Santa Clara, CA, USA). Copy number estimates were determined using the CNAG software package (v2.0) to exclude causative copy number aberrations204. Genotypes were called by Affymetrix Genotype Console Software v2.1. All hybridizations had >85% successful genotype calls. PLINK v1.06 was used205 to systematically identify runs of homozygous called SNPs in the 22 siblings, 19 MR patients from consanguineous parents, 817 MR patients from non-consanguineous parents and 159 healthy controls from an outbred population. In each window of 50 SNPs, up to five SNPs with a missing call and a maximum of two heterozygous called SNPs were allowed. We determined i) regions of at least 1 Mb that contained a minimum of 50 contiguous, genotyped homozygous SNPs, ii) regions ≥ 1.5 Mb containing a minimum of 75 contiguous SNPs, iii) regions > 2 Mb containing a minimum of 100 SNPs and iv) regions > 5 Mb containing a minimum of 250 SNPs. For allelic matching, segments were compared pairwise, and samples were grouped into the same haplotype


Table 1

Family | Gender | Age (years) | MR level | Additional clinical findings
ARMR1 | male | 48 | moderate | autism
ARMR1 | female | 38 | severe | autism
ARMR2 | male | 10 | mild |
ARMR2 | female | 5 | mild | hydrocephalus with macrocephaly
ARMR3 | male | 17 | mild-moderate | behavioural problems
ARMR3 | female | 17 | mild-moderate | papilloedema
ARMR4 | male | 5 | moderate |
ARMR4 | female | 13 | moderate | pectus carinatum
ARMR5 | male | 10 | mild-moderate |
ARMR5 | female | 6 | mild-moderate |
ARMR6 | male | 6 | mild | blepharophimosis
ARMR6 | female | 11 | mild |
ARMR7 | male | 10 | mild |
ARMR7 | female | 15 | mild |
ARMR8 | male | deceased | severe | facial dysmorphisms, kyphoscoliosis, rigidity with contractures, hypertrichosis
ARMR8 | female | 20 | severe | facial dysmorphisms, kyphoscoliosis, rigidity with contractures, hypertrichosis, hip dysplasia, myopia, hearing impairment
ARMR9 | male | 7 | moderate | ear pits, obesity
ARMR9 | female | 6 | mild-moderate | obesity
ARMR10 | male | 7 | severe | autism, hypokinetic rigid syndrome
ARMR10 | male | 7 | severe | autism, hypokinetic rigid syndrome
ARMR10 | female | 10 | severe | autism, hypokinetic rigid syndrome

group if at least 0.95 of jointly non-missing, jointly homozygous SNPs were identical. Regions that were homozygous in both sibs but had different haplotypes were excluded.
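The run criteria and the allelic matching rule described above can be illustrated with a simplified scan. This is only a sketch and not PLINK's sliding-window algorithm: it assumes calls coded as "hom", "het" or None (missing), it breaks a run on every heterozygous call instead of tolerating up to two per window, and all function and parameter names are hypothetical.

```python
def find_roh(positions, calls, min_bp=1_000_000, min_snps=50):
    """Return (start_bp, end_bp, n_hom_snps) for each qualifying homozygous run.

    positions: SNP positions in base pairs, sorted along one chromosome.
    calls: "hom", "het" or None (missing call) for each SNP.
    Simplification (assumption): any heterozygous call ends the run,
    and missing calls neither extend nor break it.
    """
    runs = []
    start = end = None
    n_hom = 0
    for pos, call in zip(positions, calls):
        if call == "hom":
            if start is None:
                start = pos
            end = pos
            n_hom += 1
        elif call == "het":
            if start is not None and n_hom >= min_snps and end - start >= min_bp:
                runs.append((start, end, n_hom))
            start, end, n_hom = None, None, 0
    # Close a run that extends to the last SNP.
    if start is not None and n_hom >= min_snps and end - start >= min_bp:
        runs.append((start, end, n_hom))
    return runs


def same_haplotype(geno_a, geno_b, threshold=0.95):
    """Allelic matching: compare SNPs that are non-missing and homozygous in both.

    geno_a, geno_b: per-SNP genotypes such as "AA", "AB", "BB" or None.
    """
    joint = [
        (a, b)
        for a, b in zip(geno_a, geno_b)
        if a is not None and b is not None and a[0] == a[1] and b[0] == b[1]
    ]
    if not joint:
        return False
    identical = sum(1 for a, b in joint if a == b)
    return identical / len(joint) >= threshold
```

Calling `find_roh` with `min_bp` and `min_snps` set to (1.5 Mb, 75), (2 Mb, 100) or (5 Mb, 250) would mimic the other three size categories.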


Results and Discussion

Clinical data We studied ten families with mild to severe MR (families ARMR1-10, Table 1). All families had one affected male and one affected female, except for ARMR10, which had an affected female and two affected males, the latter two being monozygotic twins. Additional clinical features were observed in eight families, ranging from minor clinical features such as pectus carinatum in ARMR4 to major clinical features such as hypokinetic rigid syndrome with severe autism in ARMR10.

Homozygosity mapping

Individual regions of homozygosity (ROHs) Population studies often use a size cut-off value of 1 Mb for the detection of regions of homozygosity (ROHs)206,209. Homozygous mutations in outbred families have been identified in regions as small as 2.1 Mb, and mutations in consanguineous families are usually found in large linkage intervals encompassing several to tens of megabases107,156,210. Therefore we chose to analyze the ten outbred families with four different ROH size cut-off values, namely 1 Mb, 1.5 Mb, 2 Mb and 5 Mb. The number of ROHs over 1 Mb in size varied from 20 in the male patient of ARMR3 to 74 in the male patient of ARMR1, and was on average 57 per individual (Table 2). We observed on average 14 ROHs over 1.5 Mb (range 2-22) and 4 ROHs over 2 Mb (range 0-9) in size. ROHs over 5 Mb in size were observed in three families (ARMR1, 7 and 8). Notably, the female patient of ARMR7 has seven ROHs considerably larger than 5 Mb (range: 6.4 to 27.1 Mb), indicating that in this family the parents are likely related, although her affected brother has only one homozygous stretch of 6 Mb. The average of 57 ROHs per individual is in line with previous reports showing on average 31 ROHs in healthy individuals (range 0-115 in 2,429 individuals)206,207,209. To test whether the siblings were truly from outbred families, we compared the total amount of genomic homozygosity (all regions > 1 Mb) of the siblings with (i) 19 MR patients from consanguineous parents, (ii) 817 MR patients from non-consanguineous parents and (iii) 159 healthy controls from an outbred population. With Student's t-test, the siblings, having on average 178 Mb (SD: 75 Mb) of the genome homozygous, differed significantly (p=8.34E-08) from the consanguineous MR patients, who had on average 358 Mb (SD: 98 Mb) of their genome homozygous.
There was no significant difference between the siblings and the non-consanguineous MR patients (p=0.441, average homozygosity 191 Mb, SD: 49 Mb) or the healthy controls (p=0.337, average homozygosity 194 Mb, SD: 74 Mb). This shows that the total amount of genomic homozygosity of the siblings in the current study is not different from that observed in an outbred population.
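As a rough check of the comparison between siblings and consanguineous patients, the t statistic can be recomputed from the reported means and standard deviations. The sketch below assumes an equal-variance (pooled) Student's t-test and group sizes of 22 and 19; the exact number of siblings entered into the test is an assumption, and the p-value itself would additionally require a t-distribution function, which is left out here.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Siblings (assumed n = 22) versus consanguineous MR patients (n = 19),
# total homozygosity in Mb as reported in the text.
t_stat, df = pooled_t(178, 75, 22, 358, 98, 19)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A t statistic of this magnitude with about 39 degrees of freedom corresponds to a very small p-value, in keeping with the reported significance.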


Shared regions of homozygosity (SROHs) We observed shared regions of homozygosity (SROHs) in the siblings in all 10 families and categorized these SROHs according to sizes longer than 1 Mb, 1.5 Mb, 2 Mb and 5 Mb, respectively (Figure 1). On average, siblings shared 21 regions longer than 1 Mb (range: 8-36) which represents almost 40% of the individual ROHs per sib (Table 2). Siblings shared on average four regions over 1.5 Mb in size (average: 27% of individual ROHs) and one region over 2 Mb in size (average: 25% of individual ROHs). Two families, ARMR1 and 8, showed one single SROH longer than 5 Mb; 11.0 Mb on chromosome 19q13 and 8.4 Mb on chromosome 6q23, respectively. Based on Mendelian inheritance, siblings are expected to share 25%

Figure 1 This graph shows the number of shared regions of homozygosity between siblings (SROHs) in each family, categorized by different size cut-off values for ROH detection (ROH size >1 Mb, >1.5 Mb, >2 Mb, >5 Mb)

[Bar chart. x-axis: families ARMR1 to ARMR10 and the mean; y-axis: number of SROHs, ticks from 10 to 40.]

The number of SROHs of 1 Mb or longer varies from eight SROHs in ARMR3 to 36 in ARMR6 (mean 21); this number is reduced on average to four SROHs of 1.5 Mb or longer, while only two families, ARMR1 and ARMR8, have an SROH over 5 Mb in size. Black: SROHs with size cut-off for ROH analysis of >1 Mb, dark grey: SROHs size cut-off >1.5 Mb, light grey: SROHs size cut-off >2 Mb, white: SROHs size cut-off >5 Mb.


Table 2 Summary of the number of reported homozygous regions in 10 AR-MR families with a size cut-off value of 1 Mb for homozygous regions. Shown are the individual numbers of homozygous runs (ROH), the shared homozygous runs between siblings of each family (SROH), and the shortest and longest SROH for each family, with two families having a relatively long SROH of over 5 Mb in size.

| Family | ROH female | ROH male | SROH; >1 Mb | Shortest SROH | Longest SROH |
|---|---|---|---|---|---|
| ARMR1 a | 64 | 74 | 18 | 0.3 Mb | 11.0 Mb |
| ARMR2 | 62 | 48 | 18 | 0.4 Mb | 3.0 Mb |
| ARMR3 | 33 | 20 | 8 | 1.0 Mb | 1.9 Mb |
| ARMR4 | 68 | 59 | 26 | 0.9 Mb | 2.7 Mb |
| ARMR5 | 37 | 62 | 11 | 0.8 Mb | 2.3 Mb |
| ARMR6 | 64 | 84 | 36 | 0.7 Mb | 2.3 Mb |
| ARMR7 | 72 | 67 | 35 | 0.9 Mb | 2.4 Mb |
| ARMR8 | 69 | 68 | 28 | 0.9 Mb | 8.4 Mb |
| ARMR9 | 43 | 51 | 18 | 0.6 Mb | 2.9 Mb |
| ARMR10 b | 33 | 65 | 13 | 1.0 Mb | 1.9 Mb |
| Mean | 54.5 | 59.8 | 21.1 | | |
| % SROH shared | | | 36.8% | | |

a Homozygosity mapping of an unaffected brother reduced the amount of SROHs in the patients of ARMR1 from 33 to 18 regions. b ARMR10: data of the female patient and one of the monozygotic twin brothers is shown.

of their individual ROHs. Therefore, the 40% overlap of ROHs longer than 1 Mb observed between sibs is most likely partly due to false positive ROHs. However, the rate of false positive ROHs drops as the size of the ROHs increases, as we observed for the 1.5 and 2 Mb regions with 27% and 25% homozygous regions shared between siblings, respectively. In ARMR1, consisting of four siblings (an affected brother and sister and two healthy brothers), we also genotyped one of the healthy brothers. In this family, 18 ROHs longer than 1 Mb were shared between the patients but not with the healthy sib, thereby reducing the number of candidate regions from 33 to 18 (a 45% reduction). Notably, the longest SROH of 11.0 Mb on chromosome 19q13 was heterozygous in the healthy brother. For all SROHs over 2 Mb in size (12 in total), we compared haplotypes of the ARMR families with haplotypes of the 817 non-consanguineous MR patients and the 159


healthy controls to see whether these haplotypes were unique or part of a frequently occurring haplotype. We considered a haplotype as shared between individuals when there was a minimal overlap of 2 Mb. Three SROHs had a unique haplotype (ARMR1, chr16: 78.198.864-80.847.224; ARMR8, chr6: 130.485.817-138.854.223 and chr9: 131.487.840-133.745.038, http://genome.ucsc.edu/, hg18), among which was the second longest SROH of 8.4 Mb on chromosome 6 in ARMR8. The 11.0 Mb SROH of family ARMR1 (chr19: 38.737.515-48.316.888) had a 2.3 Mb overlap with one patient (chr19: 38.737.515-41.057.851); the haplotype of the remaining 8.7 Mb of this 11.0 Mb SROH showed no overlap. Another SROH of 2.6 Mb on chromosome 6 in ARMR9 had an overlap with three patients (chr6: 62.030.184-64.647.424). Six SROHs had a more common haplotype occurring in 2.6 to 14.4% of the 976 studied individuals (Table 3).
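The positional part of the 2 Mb sharing rule can be sketched as a simple interval comparison. The snippet below only measures how far two homozygous segments overlap on the same chromosome; it does not compare alleles, and the function names are hypothetical. The chr19 coordinates are those quoted in the text (hg18).

```python
def overlap_bp(segment_a, segment_b):
    """Length in base pairs of the overlap between two (start, end) segments."""
    start = max(segment_a[0], segment_b[0])
    end = min(segment_a[1], segment_b[1])
    return max(0, end - start)


def meets_overlap_rule(segment_a, segment_b, min_overlap_bp=2_000_000):
    """True when two segments overlap by at least the minimal 2 Mb."""
    return overlap_bp(segment_a, segment_b) >= min_overlap_bp


sroh_armr1_chr19 = (38_737_515, 48_316_888)      # SROH of family ARMR1
patient_segment_chr19 = (38_737_515, 41_057_851)  # overlapping patient segment

shared = overlap_bp(sroh_armr1_chr19, patient_segment_chr19)
print(f"{shared / 1e6:.1f} Mb overlap, rule met: "
      f"{meets_overlap_rule(sroh_armr1_chr19, patient_segment_chr19)}")
```

The same helper could be applied to the overlap of an SROH with a known locus, as long as both intervals are given in the same genome build.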

Shared regions of homozygosity (SROHs) and known AR-MR loci None of the reported SROHs encompassed one of the six reported non-syndromic AR-MR genes or encompassed syndromic AR-MR genes that could explain the phenotype in these families. Overlap of SROHs with known AR-MR loci was observed for MRT7 in ARMR1 and 2, for MRT8 in ARMR2, 3, 6 and 7, for MRT9 in ARMR7, for MRT10 in ARMR1, 6, 7, 8 and 10, and for MRT11 in ARMR1 (OMIM: #611093, %611094, %611095, %611096, %611097) (Figure 2, Table 4). The size of these homozygous segments varied from 950 kb to 1.9 Mb, except for the segment overlapping MRT11, which was 11.0 Mb in size (Figure 2). Families ARMR2, 3 and 6 shared the same haplotype in the MRT8 locus (1.1 Mb overlap: rs16929951; rs16930750, 15 genes), as did ARMR1, 6, 7, 8 and 10 for the MRT10 locus. Since more than 2000 genes might contribute to AR-MR and the mutation frequency of each single gene is presumably below 0.1%, each family in this study will most probably have a unique genetic defect giving rise to the MR. Therefore, the overlap of several families with part of the MRT7, MRT8 and MRT10 loci is unlikely to contain the causative genetic defect in all families, but is likely to reflect common regions of homozygosity as reported by Lencz et al.208 In their study, 9% and 15% of 144 healthy individuals were homozygous for MRT8 and MRT10, respectively. Of more interest is the overlap of the single families with the MRT9 and MRT11 loci, especially the latter, as this overlap of 2.9 Mb with MRT11 is part of an 11.0 Mb homozygous region (388 SNPs: rs1864132; rs16959168) that is shared between the affected siblings of family ARMR1 and is heterozygous in the unaffected brother. Besides an individual 5.4 Mb ROH in the female patient of this family, this 11.0 Mb ROH is the only homozygous stretch exceeding 5 Mb in both siblings.
The MRT11 locus (MIM: %611097), reported by Najmabadi et al.107, was mapped in a consanguineous Iranian family with four patients with non-syndromic, moderate MR, with a maximum LOD score of 4.0. The MRT11 candidate region is a 5.4 Mb region between rs2109075

Table 3 Comparison of the haplotypes of SROHs over 2 Mb in size between the ten ARMR families and 976 other individuals (817 MR patients + 159 controls)

| # | Family | Size SROH | Chr. | Position (hg18) | Overlapping haplotypes | Size overlapping haplotypes (position) |
|---|---|---|---|---|---|---|
| 1 | ARMR1 | 2.6 Mb | 16 | 78198864-80847224 | 0 | unique haplotype |
| 2 | ARMR1 | 11.0 Mb | 19 | 38737515-49717305 | 1 | 2.3 Mb (38737515-410 |
| 3 | ARMR4 | 2.5 Mb | 4 | 32282106-34822231 | 141 | 2.0-2.5 Mb (32282106 |
| 4 | ARMR4 | 2.7 Mb | 6 | 26650873-29363865 | 25 | 2.1-2.7 Mb (26650873- |
| 5 | ARMR4 | 2.2 Mb | 11 | 48100690-50374125 | 135 | 2.1-2.2 Mb (48100690 |
| 6 | ARMR7 | 2.4 Mb | 6 | 26261314-28708471 | 26 | 2.0-2.4 Mb (26261314- |
| 7 | ARMR7 | 2.4 Mb | 7 | 117401456-119828493 | 31 | 2.1-2.4 Mb (117401456 |
| 8 | ARMR8 | 8.4 Mb | 6 | 130485817-138854223 | 0 | unique haplotype |
| 9 | ARMR8 | 2.2 Mb | 9 | 131487840-133745038 | 0 | unique haplotype |
| 10 | ARMR8 | 2.4 Mb | 11 | 47841423-50256441 | 136 | 2.1-2.2 Mb (47841423- |
| 11 | ARMR9 | 2.6 Mb | 6 | 62030184-64651646 | 3 | 2.1-2.6 Mb (62030184- |
| 12 | ARMR9 | 2.9 Mb | 8 | 47043376-49982983 | 104 | 2.0-2.9 Mb (47043376 |

chr. = chromosome. freq. = frequency
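The haplotype frequencies quoted for the more common SROHs follow from the counts in Table 3 divided by the size of the comparison cohort. The sketch below assumes that the count column gives the number of individuals carrying an overlapping haplotype among the 817 MR patients and 159 controls; the function name is hypothetical.

```python
COHORT_SIZE = 817 + 159  # MR patients plus healthy controls


def haplotype_frequency_percent(n_overlapping, cohort_size=COHORT_SIZE):
    """Percentage of the comparison cohort carrying an overlapping haplotype."""
    return 100.0 * n_overlapping / cohort_size


# Lowest and highest counts among the more common haplotypes in Table 3.
for count in (25, 141):
    print(count, f"{haplotype_frequency_percent(count):.1f}%")
```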

Figure 2 SROHs showing overlap with MRT7-10 (http://genome.ucsc.edu/, hg18)

[UCSC Genome Browser view of chr19 (scale bar 5 Mb; chromosome bands 19q13.2 and 19q13.31) with tracks for ARMR1, the MRT11 locus, chromosome bands localized by FISH mapping clones, and RefSeq genes.]

A) Overlap of the MRT7 locus with ARMR1 and ARMR2. B) Overlap of the MRT8 locus with ARMR2, 3, 6 and 7. The first three families share the same haplotype. This region is also reported by Lencz et al.208 to be homozygous in 9% of 144 healthy individuals. C) Overlap of the MRT9 locus with ARMR7. D) Overlap of the MRT10 locus with ARMR1, 6, 7, 8 and 10. All families share the same haplotype. This region is reported to be homozygous in 15% of 144 healthy individuals by Lencz et al.208 E) 11 Mb SROH of ARMR1, showing 2.9 Mb overlap with the MRT11 locus. F) Enlargement of the 2.9 Mb overlap containing 98 genes.

Table 4

| Locus or gene | Chromosome | Position (hg18) | Overlapping SROH |
|---|---|---|---|
| nonsyndromic AR-MR loci | | | |
| MRT7 (OMIM: #611093) | 8p12-p21.1 | 28758718-35262498 | ARMR1: 33747848-35018711 |
| | | | ARMR2: 32706616-33655517 |
| MRT8 (OMIM: %611094) | 10q21.3-q22.3 | 71041135-80718164 | ARMR2: 73625658-75091214 |
| | | | ARMR3: 73936224-75413532 |
| | | | ARMR6: 73572511-75091214 |
| | | | ARMR7: 77112878-78350873 |
| MRT9 (OMIM: %611095) | 14q12-q13.1 | 26578858-32780358 | ARMR7: 30444093-31494901 |
| MRT10 (OMIM: %611096) | 16p12.1-q12.1 | 22705353-48948887 | ARMR1: 45065445-46760004 |
| | | | ARMR6: 45065445-46937833 |
| | | | ARMR7: 45065445-46760004 |
| | | | ARMR8: 45065445-46733997 |
| | | | ARMR10: 45780919-46732060 |
| MRT11 (OMIM: %611097) | 19q13.2-q13.32 | 46844069-52292281 | ARMR1: 38737515-49717305 |
| nonsyndromic AR-MR genes | | | |
| PRSS12 | 4q26 | 119421865-119493370 | - |
| CRBN | 3p26.3 | 3166696-3196390 | - |
| GRIK2 | 6q16.1-q21 | 101953626-102624651 | - |
| TUSC3 | 8p22 | 15442101-15666366 | - |
| TRAPPC9 | 8q24.3 | 140811770-141537860 | - |
| CC2D1A | 19p13.12 | 13884981-13902692 | - |

and rs8101149 containing 195 genes, whereas the 2.9 Mb overlap reduces the number of candidate genes to 98 [Map Viewer, build 36.3]. The results obtained for ARMR1, with non-syndromic, severe MR, support that homozygosity mapping in outbred families might contribute to the identification of novel AR-MR genes, especially in combination with next generation sequencing technologies. We report the first study of homozygosity mapping in outbred MR families to map AR-MR genes. Our data of ten AR-MR families show that most outbred families share only relatively short homozygous regions (<5 Mb), with only two individual families sharing one relatively long homozygous region (8.4 and 11.0 Mb). Follow-up studies in these and other families, combining data of SROHs with next generation sequencing, will further show whether homozygosity mapping in outbred families is a useful approach to unravel the molecular basis of AR-MR.

Acknowledgements: We are grateful to the patients and their families for their support and cooperation. This work has been supported by grants from the Dutch Organisation for Health Research and Development (ZON-MW) (917-86-319 to B.B.A.d.V.) and Hersenstichting Nederland (B.B.A.d.V.).

Competing interest: the authors declare no conflict of interest.


Identification of recessive pathogenic alleles in small sibling families with intellectual disability

(Manuscript submitted)

Janneke H.M. Schuurs-Hoeijmakers1-3, Anneke T. Vulto-van Silfhout1,2-4, Lisenka E.L.M. Vissers1,3, Ilse I.G.M. van de Vondervoort1, Bregje W.M. van Bon1, Joep de Ligt1,3, Christian Gilissen1,3, Jayne Y. Hehir-Kwa1,3, Kornelia Neveling1,3, Marisol del Rosario1, Gausiya Hira1, Santina Reitano4, Aurelio Vitello4, Pinella Failla4, Donatella Greco4, Marco Fichera4-5, Ornella Galesi4, Tjitske Kleefstra1,3, Marie T. Greally6, Charlotte W. Ockeloen1, Marjolein H. Willemsen1,3, Ernie M.H.F. Bongers1-3, Irene M. Janssen1, Rolph Pfundt1, Joris A. Veltman1,3, Corrado Romano4, Michèl A. Willemsen7-8, Hans van Bokhoven1-3-8, Han G. Brunner1-3, Bert B.A. de Vries1-2 8, Arjan P.M. de Brouwer1-3-8

1 Department of Human Genetics, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands.
2 Institute for Genetic and Metabolic Disease, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.
3 Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.
4 Unit of Pediatrics and Medical Genetics, Unit of Neurology, Laboratory of Medical Genetics, IRCCS Associazione Oasi Maria Santissima, Troina, Italy.
5 Medical Genetics, University of Catania, Catania, Italy.
6 National Centre for Medical Genetics, Our Lady's Children's Hospital, Crumlin, Dublin.
7 Department of Pediatric Neurology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.
8 Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands.

Abstract

Intellectual disability (ID) is a common neurodevelopmental disorder affecting 1-3% of the general population. Mutations in more than 10% of all human genes are considered to be involved in this disorder, of which the majority are not known. Recent studies have shown that monogenic causes have a major contribution to severe ID, but that this monogenic constitution is highly heterogeneous in its origin. To identify recessive ID genes, we systematically investigated 20 small (non-consanguineous) sibling families with ID, a large group of patients that were previously less accessible for gene identification as the small family size precludes prior mapping of the genetic defect. By using exome sequencing, we identified pathogenic mutations in three genes, DDHD2, SLC6A8 and SLC9A6, of which the latter two have previously been implicated in X-linked ID. In addition, we identified potentially pathogenic mutations in BCORL1 on the X-chromosome and in MCM3AP, PTPRT, SYNE1, TDP2 and ZNF528 on the autosomes. We show that recessive pathogenic alleles can readily be identified in small sibling families, thus emphasizing their value in identification of ID genes.


Introduction

Intellectual disability (ID) is a common disorder affecting 1-3% of the population in Western countries1. ID is clinically defined by significant limitations both in intellectual functioning (intelligence quotient below 70) and in adaptive behavior that manifest before the age of 18 years5. In origin, ID can either be an acquired or a genetic condition. Genetic causes of ID encompass a plethora of defects from chromosomal abnormalities, imprinting disorders and single gene mutations to mitochondrial DNA mutations. Recent studies have indicated that autosomal de novo, single gene mutations have a major contribution to severe ID (up to 35%)115,134, while mutations in X-linked genes account for 10 to 15% of males with ID105. The contribution of recessive alleles to ID is considered to be as high as 25%150,152,211,212. Only a minority of the autosomal recessive ID (ARID) genes has been identified so far, considering the more than 2,500 genes that might be involved in ID157. Until now, only one study has systematically assessed multiple consanguineous families with nonspecific ARID to identify these missing ARID genes110. The authors centered their approach around conventional homozygosity mapping in large consanguineous Iranian families. All genes in the homozygous regions were interrogated by massive parallel sequencing. Use of these large consanguineous families to identify novel ID genes is limited by their specific ethnic background, restricting these studies to countries with high rates of consanguineous marriages. In the Western population, however, ID families are mostly non-consanguineous and often consist of only two or three affected siblings. Families with multiple affected siblings comprise 6% of more than 4,000 families in our in-clinic ID cohort.
This considerable group of small sibling families has not been systematically assessed so far, whereas they could contribute significantly to the identification of novel ARID genes if made accessible for research. Here, we have studied one exome per sibling family with (non)syndromic ID (n=20) to systematically determine the underlying genetic defect. The approach that we describe not only contributes to the identification of (candidate) ID genes but also offers these families, with a high recurrence risk of 25%, the possibility to obtain a molecular diagnosis, genetic counseling and prenatal diagnosis.

Families and Methods

Families Clinical information and peripheral blood samples were collected from the families after obtaining written informed consent. Our research project was approved by the local ethics committee (Commissie Mensgebonden Onderzoek Regio Arnhem-Nijmegen) according to the World Medical Association Declaration of Helsinki. All but one family had non-consanguineous parents and all had at least two affected siblings: 13 families consisted of affected brother-sister pairs, three families of affected sister-sister pairs, and four families of affected brother pairs (for clinical description see Table 1). Previous clinical and molecular evaluation by conventional karyotyping and array CGH had not led to an etiological diagnosis.

Library preparation for exome sequencing Exome sequencing was performed on genomic DNA of one affected individual from each of the 20 families. Exome enrichment required 3 µg genomic DNA. Genomic DNA was captured with an AB SOLiD optimized SureSelect 50 Mb human exome kit (Agilent, Santa Clara, CA, USA) representing exonic sequences for ~21,000 genes (including >99% of genes of the CCDS version Sept. 2009 and >95% of RefSeq genes and transcripts version June 2010, as specified by the company). Manufacturer's instructions (version 1.5) for enrichment were followed with a minor modification, which was the reduction of the number of post-hybridization LM-PCR cycles from twelve to nine. To allow for multiplexing libraries prior to sequencing, post-hybridization sample barcodes were used (Agilent, Santa Clara, CA, USA) that were compliant with SOLiD sequencing technology.

SOLiD sequencing and mapping Enriched exome libraries were equimolarly pooled in sets of four, based on a combined library concentration of 0.7 pM. Subsequently, the obtained pool was used for emulsion PCR and bead preparation using the EZbead system, following manufacturer's instructions (version 05/2010; Life Technologies, Carlsbad, CA, USA). For each pool of four exome libraries, a full sequencing slide was used on a SOLiD™ 4 System (Life Technologies, Carlsbad, CA, USA), thereby anticipating that each of the four samples would be represented by 25% of the total beads sequenced on the slide. Colour space reads were mapped to the hg19 reference genome with the SOLiD Bioscope software version 1.3.

Calling and prioritization of sequence variants Sequence variations, including variations, were selected using quality settings that required the presence of at least two unique variant reads as well as the variation being present in at least 20% of all reads. Non-genic, intronic and synonymous sequence variants (but not canonical splice sites) were excluded as well as alleles found in >1% of the population based on dbSNPv134 and our local variant database consisting of data derived from 672 exome experiments. Under the hypothesis of a recessive inheritance model, we selected all rare variants in genes on the autosomes present either in presumed homozygous state (>80%


Congenital anomalies; Neurological abnormalities; Other: Progressive polyneuropathy, cerebellar Hypotonia ataxia, saccadic eye movement Spasticity, MRI: neonatal hypomyelinisation Microcephaly Short stature, behavioral problems Hypokinetic rigid syndrome, contractures ATP production defect (1/2), hypotonia Microcephaly Ataxia, wheelchair bound Happy behaviour Autism with Gilles de la Tourette (1/3) Microcephaly Short stature, hirsutism, hypermobile joints, long fingers Dental caries, mild hypodontia (1/3), large pigmented mole (1/3) Microcephaly (3/5), Autism or behavioral problems vitium cordis (3/5), hernia (4/5), short stature (4/5)c diafragmatica (4/5), pectus excavatum (2/5) Small head circumference Spastic tetraplegia, cerebral visual Strabismus, short stature (-2 SD) impairment, nystagmus (< -2.5 SD), scoliosis, hip dysplasia, hypogonadism, elevated creatine/creatinine ratio EEG anomalies

Microcephaly (1/2) Hereditary spastic paraplegia, motor Myopia (1/2), axonal neuropathy, nystagmus vesicourethral reflux (1/2) (1/2), EEG anomalies (1/2), MRI: leukoencephalopathy, corpus callosum hypoplasia (1/2) MRI: multicystic encephalomalacia (1/5) Aggressive behavior (1/5) Hereditary spastic paraplegia, MRI: thin corpus callosum, subtle white matter hyperintensities, thoracal syrinx (1/2)

Thoracal asymmetry Spasticity, MRI: leukoencephalopathy Hydrocephalus with macrocephaly (1/2), Behavioral problems (1/2) MRI: bilateral brainstem pathology (1/2) Eye movement disorder


Figure 1 Flow diagram of variant selection procedure

20 sibling families
-> exome sequencing, read mapping & variant calling (SureSelect 50 Mb human exome kit on SOLiD™ 4 system, Bioscope version 1.3)
-> variant analysis
• QC filtering; excluded: < 2 unique reads *, < 20% variant reads
• Excluded: SNPs with frequency > 1% (dbSNP134, in-house database, N = 672)
• Excluded: non-genic, intronic & synonymous variants
-> private rare variants
• Brother pairs: variants on X (>80% variation reads **)
• All families: homozygous (>80% variation reads **) & compound heterozygous variants
-> validation & segregation (Sanger sequencing); excluded: non-validated, not segregating
-> evaluation of gene function/OMIM disease phenotype, CNS expression, mutation impact and aa conservation
-> (candidate) disease causing variant

Sanger validation of variations with 2-4x coverage (*) and of variations with 70-80% variation reads (**) in raw sequence reads

| | Total | Confirmed homozygous | Confirmed heterozygous | Not present | % Confirmed homozygous | % Confirmed heterozygous |
|---|---|---|---|---|---|---|
| * 2-4 variation reads | | | | | | |
| Substitutions | 16 | 2 | 12 | 2 | 12.5 | 75 |
| Indels | 31 | 12 | 10 | 9 | 38.7 | 32.5 |
| Total | 47 | 14 | 22 | 11 | 29.8 | 46.8 |
| ** 70-80% variation reads | | | | | | |
| Substitutions | 38 | 2 | 27 | 9 | 5.3 | 71.1 |
| Indels | 1 | 0 | 0 | 1 | 0.0 | 0.0 |
| Total | 39 | 2 | 27 | 10 | 5.1 | 69.2 |
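The selection steps of the flow diagram can be sketched as a small filter. This is an illustration under stated assumptions, not the pipeline that was actually used: the `Variant` fields, effect labels and gene names are hypothetical, and a gene is treated as a compound heterozygous candidate when it carries at least two rare heterozygous variants, a criterion that is assumed here.

```python
from collections import defaultdict
from dataclasses import dataclass

EXCLUDED_EFFECTS = {"non-genic", "intronic", "synonymous"}


@dataclass
class Variant:
    gene: str
    chrom: str
    effect: str                # e.g. "missense", "splice_site", "synonymous"
    unique_variant_reads: int
    variant_fraction: float    # variant reads / all reads at the position
    population_freq: float     # highest frequency in dbSNP or local database


def passes_filters(v):
    """QC (>= 2 unique variant reads, >= 20% variant reads), effect and rarity."""
    return (
        v.unique_variant_reads >= 2
        and v.variant_fraction >= 0.20
        and v.effect not in EXCLUDED_EFFECTS
        and v.population_freq <= 0.01
    )


def recessive_candidates(variants, include_x=False):
    """Split rare variants into homozygous, compound heterozygous and X-linked."""
    rare = [v for v in variants if passes_filters(v)]
    homozygous = [v for v in rare if v.chrom != "X" and v.variant_fraction > 0.80]
    # X-linked candidates are only considered for brother pairs.
    x_linked = [
        v for v in rare
        if include_x and v.chrom == "X" and v.variant_fraction > 0.80
    ]
    per_gene = defaultdict(list)
    for v in rare:
        if v.chrom != "X" and v.variant_fraction <= 0.80:
            per_gene[v.gene].append(v)
    # Assumption: two or more rare heterozygous variants in one gene.
    compound_het = {g: vs for g, vs in per_gene.items() if len(vs) >= 2}
    return homozygous, compound_het, x_linked
```

Candidates returned by such a filter would still need Sanger validation and segregation analysis, as shown in the diagram.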


(Ct) values were within the range of DNA dilutions used to validate the primers. The melt curves of all PCR products showed a single PCR product. All water controls were negative. Copy numbers were measured relative to SLC16A2 (NM_006517.3) on the X-chromosome. Differences in copy number of a genomic sequence of interest between the individual test sample and a control sample were calculated by the comparative Ct or 2^-ΔΔCt method216,217 after correction for gender.
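The comparative Ct calculation can be written out as follows. The sketch assumes 100% PCR efficiency (a doubling per cycle) and models the gender correction by the number of X-chromosomal reference copies in each sample (one in males, two in females); how the correction was actually applied is not described here, so this part and all names are assumptions.

```python
def relative_copy_number(ct_target_test, ct_ref_test,
                         ct_target_ctrl, ct_ref_ctrl,
                         ref_copies_test=1, ref_copies_ctrl=1):
    """Copy number of the target in the test sample relative to the control.

    Uses the comparative Ct (2^-ddCt) method; the reference amplicon lies on
    the X chromosome, so its copy number depends on the sex of the sample.
    """
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    ratio = 2 ** -(delta_ct_test - delta_ct_ctrl)
    # Rescale for unequal numbers of reference copies (gender correction).
    return ratio * ref_copies_test / ref_copies_ctrl


# Hypothetical Ct values: a target that amplifies one cycle later in the
# test sample than in a sex-matched control suggests half the copy number.
print(relative_copy_number(25.0, 24.0, 24.0, 24.0))
```

A value near 0.5 would be compatible with a heterozygous deletion and a value near 0 with a homozygous deletion, relative to a control carrying two copies.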

Results

Candidate recessive variants We obtained on average 5.1 Gb of mappable sequence data per exome, resulting in an average median coverage of 58-fold. On average, more than 85% of the targets were covered more than 10 times (Table 2, Figure 2). Comparison of sequence reads to the reference genome (GRCh37/hg19) showed between 22,022 and 32,149 sequence variants per exome. Exclusion of non-genic, intronic and synonymous sequence variants (but not canonical splice sites) as well as alleles found in >1% of the population resulted in 230 rare variants on average per exome (range 155-413; Table 2). Selection for candidate rare recessive variants revealed on average five candidate homozygous variations per exome (range 1-9), and eight candidate compound heterozygous variations (in four genes; range 1-10 genes; Table 2). This is about half as many candidates as identified in an average exome of an affected individual from a consanguineous ID family using the exact same approach (Z. Iqbal, personal communication). For the brother pair families (n=4), we also considered an X-linked mode of inheritance. This resulted in on average four candidate variations on the X-chromosome (range 1-5).

Segregating recessive variants For all selected apparently recessive variations, we performed Sanger sequencing to validate presence of the variation in the proband and to study segregation of the variation within the families. Of note, 47 candidate variations had a 2-4x variation coverage, of which 14 (30%) were confirmed as homozygous and 22 (47%) as heterozygous by Sanger sequencing, indicating that our cut-off of ≥2 variation reads for variant selection is crucial in order not to miss those candidate mutations (Figure 1). In total, homozygous variants in seven genes, compound heterozygous recessive variants in eight genes, and hemizygous variants in six genes segregated with the ID phenotype within the 20 families (Tables 3 and 4).
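The confirmation percentages for the low-coverage candidates follow directly from the counts. A minimal sketch, assuming that the 47 candidates split into 14 confirmed homozygous, 22 confirmed heterozygous and the remaining 11 not confirmed; the function name is hypothetical.

```python
def confirmation_rates(homozygous, heterozygous, not_present):
    """Percentage of Sanger-tested candidates in each outcome class."""
    total = homozygous + heterozygous + not_present
    return {
        "total": total,
        "homozygous_pct": 100.0 * homozygous / total,
        "heterozygous_pct": 100.0 * heterozygous / total,
        "not_present_pct": 100.0 * not_present / total,
    }


rates = confirmation_rates(homozygous=14, heterozygous=22, not_present=11)
print({key: round(value, 1) for key, value in rates.items()})
```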


Table 2 Raw sequencing statistics and variant selection per exome per sibling family

| Family | W05-385 | W06-0984 | W07-1443 |
|---|---|---|---|
| Total number of sequenced reads (x10^6) | 133.9 | 126.9 | 120 |
| Total number of mapped reads (x10^6) | 108.9 | 90.9 | 87 |
| Total number of bases mapped (Gb) | 5.1 | 4.2 | 4.0 |
| Total bases mapping to targets (Gb) | 3.8 | 3.3 | 2.9 |
| % targets with 10x coverage | 84.89% | 87.08% | 82.33% |
| Mean target coverage (fold) | 70.08 | 61.87 | 54.47 |
| Median target coverage (fold) | 54.77 | 45.27 | 42.9 |
| All variants | 28,017 | 26,661 | 23,899 |
| QC filtering a | 27,072 | 25,655 | 22,978 |
| After exclusion of known variants b | 646 | 657 | 595 |
| After exclusion of nongenic, intronic & synonymous variants | 176 | 235 | 204 |
| Of which fit a recessive model of disease c | | | |
| » compound heterozygous d | 3 | 3 | 2 |
| » homozygous | 24 | 28 | 20 |
| Of which present in raw sequencing data | | | |
| » compound heterozygous | 3 | 3 | 2 |
| » homozygous | 7 | 6 | 7 |
| Autosomal recessive segregating variants | 3 | 2 | 0 |
| Of which fit an X-linked model of disease c | ND | ND | ND |
| Of which present in raw sequencing data on chromosome X | ND | ND | ND |
| X-linked segregating variants | ND | ND | ND |

a ≥2 rare variant reads and ≥20% of all reads. b exclusion of variations either present in dbSNP with ≥1% frequency or in our local dbSNP with ≥1% frequency. c ≥80% variant reads for homozygous variations and variants on chromosome X. d the number of genes with compound heterozygous variants is given. e only variant tested. ND = not determined.

Detection of homozygous deletions In addition to calling and prioritization of variants at base-pair level, we performed copy number variation (CNV) analysis on the exome data after mapping, using cn.MOPS215 for detection of homozygous deletions. This analysis showed three potential homozygous deletions that we followed up by genomic quantitative PCR (qPCR) to validate the presence of the homozygous deletion in the proband and to test segregation with the ID phenotype. All three homozygous deletions were

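| Family | W07-1585 | W07-1601 | W08-0135 | W08-0748 | W09-0070 | W09-1095 | W09-1109 | W09-2166 |
|---|---|---|---|---|---|---|---|---|
| Total number of sequenced reads (x10^6) | 148.5 | 150.3 | 137.7 | 123.5 | 133.2 | 110.9 | 155.4 | 100.1 |
| Total number of mapped reads (x10^6) | 123.9 | 117.5 | 112.5 | 95.6 | 110.9 | 87.8 | 127.3 | 78.5 |
| Total number of bases mapped (Gb) | 5.9 | 5.5 | 5.1 | 4.4 | 5.1 | 4.1 | 6.0 | 3.7 |
| Total bases mapping to targets (Gb) | 4.6 | 4.2 | 4.0 | 3.28 | 3.8 | 3.2 | 4.8 | 2.9 |
| % targets with 10x coverage | 82.07% | 85.06% | 87.91% | 84.57% | 86.74% | 83.75% | 85.60% | 85.67% |
| Mean target coverage (fold) | 88.4 | 80.18 | 74.01 | 61.28 | 69.07 | 62.14 | 93.58 | 53.54 |
| Median target coverage (fold) | 68.15 | 62.82 | 56.67 | 48.32 | 53.21 | 48.67 | 73.26 | 39.99 |
| All variants | 27,939 | 23,829 | 28,993 | 26,161 | 28,517 | 22,022 | 25,075 | 25,557 |
| QC filtering a | 27,275 | 22,990 | 28,095 | 25,292 | 27,695 | 21,324 | 24,254 | 24,693 |
| After exclusion of known variants b | 770 | 484 | 624 | 569 | 1286 | 445 | 470 | 530 |
| After exclusion of nongenic, intronic & synonymous variants | 231 | 183 | 197 | 192 | 413 | 155 | 158 | 206 |
| Of which fit a recessive model of disease c | | | | | | | | |
| » compound heterozygous d | 3 | 3 | 4 | 2 | 11 | 2 | 1 | 1 |
| » homozygous | 23 | 26 | 24 | 26 | 45 | 22 | 19 | 30 |
| Of which present in raw sequencing data | | | | | | | | |
| » compound heterozygous | 1 | 3 | 4 | 2 | 9 | 1 | 1 | 1 |
| » homozygous | 3 | 9 | 3 | 2 | 9 | 9 | 6 | 3 |
| Autosomal recessive segregating variants | 0 | 0 | ND | 1 | 0 | 3 | 0 | 0 |
| Of which fit an X-linked model of disease c | ND | 5 | 5 | ND | ND | 6 | ND | ND |
| Of which present in raw sequencing data on chromosome X | ND | 5 | 4 | ND | ND | 1 | ND | ND |
| X-linked segregating variants | ND | 3 | 1 e | ND | ND | 0 | ND | ND |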

present in the proband but none segregated with the phenotype in the families and were therefore not regarded as pathogenic (Table 5, Figure 3).

Variant classification Variants that segregated with the ID phenotype were classified as pathogenic if they resided either in a known ID gene, or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype upon


Table 2 Continued

| Family | W10-1134 | W10-1137 | W10-1180 |
|---|---|---|---|
| Total number of sequenced reads (x10^6) | 147 | 144.1 | 136.4 |
| Total number of mapped reads (x10^6) | 124.8 | 121.5 | 106.4 |
| Total number of bases mapped (Gb) | 6.0 | 5.8 | 5.0 |
| Total bases mapping to targets (Gb) | 4.7 | 4.5 | 4.0 |
| % targets with 10x coverage | 87.43% | 81.41% | 87.39% |
| Mean target coverage (fold) | 86.05 | 85.59 | 73.38 |
| Median target coverage (fold) | 65.93 | 65.27 | 55.15 |
| All variants | 30,194 | 28,416 | 24,295 |
| QC filtering a | 29,390 | 27,758 | 23,385 |
| After exclusion of known variants b | 952 | 979 | 680 |
| After exclusion of nongenic, intronic & synonymous variants | 289 | 297 | 239 |
| Of which fit a recessive model of disease c | | | |
| » compound heterozygous | 11 | 4 | 4 |
| » homozygous | 30 | 27 | 16 |
| Of which present in raw sequencing data | | | |
| » compound heterozygous d | 10 | 4 | 4 |
| » homozygous | 5 | 3 | 1 |
| Autosomal recessive segregating variants | 0 | 1 | 0 |
| Of which fit an X-linked model of disease c | ND | ND | ND |
| Of which present in raw sequencing data on chromosome X | ND | ND | ND |
| X-linked segregating variants | ND | ND | ND |

a ≥2 rare variant reads and ≥20% of all reads. b exclusion of variations either present in dbSNP with ≥1% frequency or in our local dbSNP with ≥1% frequency. c ≥80% variant reads for homozygous variations and variants on chromosome X. d the number of genes with compound heterozygous variants is given. e only variant tested. ND = not determined.

further study. This was the case for three out of the 20 families (Figure 4B-D). If there was no such finding, variants were classified as either potentially pathogenic or likely benign. Variants were labeled potentially pathogenic if they fulfilled four criteria: (i) the gene is already linked to a human neurologic phenotype other than ID, or is not linked to any human phenotype, as described in the Online Mendelian Inheritance in Man database (OMIM; www.ncbi.nlm.nih.gov/OMIM/), (ii) there is mRNA expression of the respective gene in brain/neuronal tissue according to the


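Family W10-1338 W10-1643 W10-2749 W11-0515 W11-3400 W11-3472 Average

Total number of sequenced reads (×10⁶) 189.24 126.5 114.0 163 165.3 142.5 138

Total number of mapped reads (×10⁶) 144.51 102.9 90.1 133.8 126.4 118.7 110

Total number of bases mapped (Gb) 6.7 4.9 4.2 6.4 5.7 5.6 5

Total bases mapping to targets (Gb) 5.8 3.8 3.6 5.0 4.3 4.22 4

% targets with 10x coverage 88.9% 80.26% 83.63% 83.22% 85.95% 88.63% 1

Mean target coverage (fold) 99.9 73.19 48.2 95.26 79.5 79.17 74

Median target coverage (fold) 74 54.58 61.39 73.77 60.78 60.06 58

All variants 30,182 26,258 23,212 28,535 26,375 32,149 26,831

QC filteringᵃ 29,117 25,704 22,417 27,854 25,351 31,172 25,999

After exclusion of known variantsᵇ 568 656 489 710 716 1,672 944

After exclusion of nongenic, intronic & synonymous variants 180 202 199 222 257 304 230

Of which fit a recessive model of diseaseᶜ

» compound heterozygous 5 5 4 5 2 10 4

» homozygous 27 20 27 31 42 17 27

Of which present in raw sequencing data

» compound heterozygousᵈ 4 5 3 4 2 10 4

» homozygous 3 2 5 6 9 2 5

Autosomal recessive segregating variants 2 0 ND 0 0 2 1

Of which fit an X-linked model of diseaseᶜ ND ND 3 ND ND ND -

Of which present in raw sequencing data on chromosome X ND ND 5 ND ND ND -

X-linked segregating variants ND ND 1ᵉ ND ND ND -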

expressed sequence tags database (www.ncbi.nlm.nih.gov/dbEST/), (iii) the effect of the missense variant(s) is predicted to be 'disease causing/damaging' by SNPs&Go²¹⁸ and/or PolyPhen-2²¹⁹, and lastly, (iv) the altered amino acid is conserved amongst vertebrates (Figure 5). Variants that did not fulfill all criteria were classified as likely benign (Tables 3 and 4). In six out of 20 families we found variants that were classified as potentially pathogenic (Figure 4A and Figure 6).


Figure 2 Coverage plots of exome experiments of sibling samples

[Coverage plots, panels A and B: y-axis, percentage of targets (0% to 100%); one curve per family. Panel A shows all 20 families; panel B shows families W08-0135, W07-1601, W09-1095 and W10-2749.]

(A) Percentage of target regions with at least n-fold average coverage, showing that for each DNA sample of the families more than 80% of the targets are covered more than 10-fold. (B) Percentage of target regions on the X-chromosome with at least n-fold average coverage for cases with potential X-linked inheritance, showing that for each sample more than 80% of the targets are covered more than 10-fold.


Pathogenic mutations in SLC9A6, SLC6A8 and DDHD2

In two of the four brother pair families, we identified pathogenic mutations in known X-chromosomal ID genes, SLC9A6 and SLC6A8. In SLC9A6 (OMIM: 300243²²⁰), a c.1639G>T mutation, resulting in the premature stop codon p.Glu547*, was identified in two brothers (W08-0135) with severe ID, a friendly personality, microcephaly, epilepsy and an ataxic gait. Sanger validation confirmed that the mutation was hemizygous in the two brothers and heterozygous in their carrier mother. The observed Angelman-like phenotype in family W08-0135 is very similar to the phenotype previously described in families with mutations in SLC9A6²²⁰, underlining the pathogenicity of this change.

In SLC6A8 (OMIM: 300352²²¹), we detected a c.1005_1007delCAA deletion resulting in a single amino acid residue deletion, p.Asn336del, within the sodium neurotransmitter symporter domain of the protein. This same mutation has been classified as pathogenic in previous studies²²² and has been proven to disrupt the transporter function of SLC6A8 in vivo²²³. Family W10-2749, of Dutch origin, consisted of three affected brothers, of whom one passed away at 13 years of age. All three brothers presented with severe ID and epilepsy. Facial asymmetry was noted in the two brothers. Additional measurement of creatine and creatinine levels in urine of both males showed an elevated creatine concentration (6459 and 3227 µmol/L for individuals 7 and 8, respectively; Figure 4D) and an elevated creatine/creatinine ratio (2084 and 1796 µmol/mmol creatinine for individuals 7 and 8, respectively), thereby confirming the molecular diagnosis at the metabolite level. Of note, manual inspection of the sequence reads of the g.152958810_152958812delCAA variant showed the deletion in only 5 of 29 reads, which is unexpected, since males have only one X chromosome. However, SLC6A8 is located in a segmental duplication region on Xq28 that shows >90% sequence identity with two segmental duplication regions on chromosome 16p11.2 (hg19, chr16: 32,872,610-32,899,081 and chr16: 33,776,247-33,802,719). Given this high sequence homology, we suspected that either the wildtype or the mutant reads were mapped incorrectly. Sanger sequencing with primers specific for SLC6A8 confirmed the hemizygous presence of this three base-pair deletion in the two affected brothers who are alive and its heterozygous presence in their carrier mother. This example, in which sequence homology interferes with correct mapping of sequence reads, highlights one of the challenges of exome data interpretation.

Pathogenic compound heterozygous frameshift mutations, c.1804_1805insT and c.2057delA, resulting in p.Thr602Ilefs*18 and p.Glu686Glyfs*35, were identified in DDHD2 in a brother and sister of a Dutch-Philippine family (W10-1338). The phenotype consisted of ID, spastic paraplegia, a thin corpus callosum and subtle periventricular white matter hyperintensities on cerebral imaging. DDHD2 is one of three mammalian intracellular phospholipase A1 enzymes and is involved in organelle biogenesis and intracellular trafficking²²⁴,²²⁵, a function that is


Table 3 Genes with (candidate) mutations

Gene ID | Family ID | NM number | cDNA level change | Protein level change | Mutation class | Supporting evidence | PhyloP | GO terminology

Truncating and pathogenic mutations

DDHD2 | W10-1338 | NM_015214.2 | c.1804_1805insT; c.2057delA | p.Thr602Ilefs*18; p.Glu686Glyfs*35 | + | 2nd family, E, T, D | 3.9; 3.0 | Lipid catabolic process; intracellular protein transport

SLC6A8 | W10-2749 | NM_005929.3 | c.1005_1007delCAA | p.Asn336del | + | MIM300352 (XL-ID), E, T, D | 4.2 | Creatine metabolic process; muscle contraction

SLC9A6 | W08-0135 | NM_001042537.1 | c.1639G>T | p.Glu547* | + | MIM300243 (XL-ID), E, T, D | 3.1 | Amino-acid transport; neurotransmitter transport

TDP2 | W09-1095 | NM_016614.2 | c.425+1G>A | p.Tyr84*, p.Leu142fs*, p.Gly135fs*16ᵃ | +/- | E, C, T, D | 5.7 | Cellular surface receptor signaling pathway; double-strand break repair

Missense mutations classified as potentially pathogenic

BCORL1 | W07-1601 | NM_021946.4 | c.2459A>G | p.Asn820Ser | +/- | E, P, C | 3.1 | Regulation of transcription; chromatin modification

MCM3AP | W05-385 | NM_003906.3 | c.2743G>A | p.Glu915Lys | +/- | E, P, C | 6.0 | DNA replication; protein import into nucleus

PTPRT | W09-1109 | NM_133170.3 | c.4094C>T; .arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1 | p.Thr1365Met | +/- | E, P, C, D | 3.5 | Protein tyrosine phosphatase activity

SYNE1 | W10-1137 | NM_182961.2 | c.1964A>G; c.9262G>A; c.11675T>C | p.Gln655Arg; p.Ala3088Thr; p.Leu3892Ser | +/- | MIM612998, MIM610743, E, P, C, D | 4.8; 5.8; 4.8 | Golgi organization; cell death; cytoskeletal & nuclear matrix anchoring at nuclear membrane; muscle cell differentiation

ZNF582 | W11-3472 | NM_144690.1 | c.193T>G; c.1034G>A | p.Trp65Gly; p.Gly345Glu | +/- | E, P, C, D | 1.4; 3.6 | DNA dependent regulation of transcription

Summary of (candidate) ID genes and their respective mutations that were identified in the sibling families. Upper section: truncating and/or pathogenic mutations. Mutations were classified as pathogenic (+) whenever they resided either in a known ID gene (OMIM), or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype upon further study. Lower section: potentially pathogenic missense mutations in genes with a functional link to ID. Mutations were classified as potentially pathogenic (+/-) if they fulfilled all four following criteria: (i) a neurologic or no phenotype related to the respective gene in the Online Mendelian Inheritance in Man database (OMIM; www.ncbi.nlm.nih.gov/OMIM/) (OMIM number is given whenever present), (ii) mRNA expression in brain/neuronal tissue according to the expressed sequence tags database (www.ncbi.nlm.nih.gov/dbEST/) (E), (iii) mutation effect on protein level is predicted as disease causing/damaging by SNPs&Go²¹⁸ and/or PolyPhen-2²¹⁹ (P), and lastly, (iv) the altered amino acid is conserved amongst vertebrates (C). Furthermore, we have indicated whether a mutation is predicted to result in nonsense mediated mRNA decay and/or truncated protein (T), and whether the mutation(s) reside in a functional domain of the protein (D). In addition, the GO terminology for biological processes is given.

ᵃ Predicted effect of splice donor site mutation. Abbreviation: XL-ID, X-linked intellectual disability.
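For single-nucleotide substitutions, the cDNA and protein columns of Table 3 can be cross-checked arithmetically, because coding position n falls in codon ⌈n/3⌉. The sketch below is an illustrative consistency check only and is not part of the original analysis; the function names are hypothetical, and insertions, deletions and splice-site changes are deliberately left unevaluated because their protein-level numbering does not follow this simple rule.

```python
import re
from typing import Optional

# Simple coding substitutions such as "c.1639G>T"; intronic offsets ("c.425+1G>A")
# and indels ("c.1005_1007delCAA") intentionally do not match.
_SUBSTITUTION = re.compile(r"^c\.(\d+)[ACGT]>[ACGT]$")
# Three-letter protein notation such as "p.Glu547*" or "p.Asn820Ser".
_PROTEIN = re.compile(r"^p\.[A-Za-z]{3}(\d+)")


def codon_number(cdna_position: int) -> int:
    """Codon (amino acid) number that contains a 1-based coding position."""
    return (cdna_position - 1) // 3 + 1


def is_consistent(cdna_change: str, protein_change: str) -> Optional[bool]:
    """True/False for simple substitutions, None when the check does not apply."""
    c = _SUBSTITUTION.match(cdna_change)
    p = _PROTEIN.match(protein_change)
    if c is None or p is None:
        return None
    return codon_number(int(c.group(1))) == int(p.group(1))
```

For example, `is_consistent("c.193T>G", "p.Trp65Gly")` and `is_consistent("c.1034G>A", "p.Gly345Glu")` both hold for the ZNF582 variants, whereas pairing c.193T>G with p.Gly345Glu does not.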


Table 4 Segregating rare recessive missense variants and indel variants in 20 sibling families

Gene | Family ID | Transcript | Mutation on cDNA level | Mutation on protein level | Mut class | Supporting evidence | PhyloP | SNPs&Go | PolyPhen-2 | Gene function | OMIM phenotype

ABCB11 | W10-1338 | NM_003742.2 | c.1244G>A; c.3044G>A | p.Arg415Gln; p.Ser1015Asn | - | E | 2.8; 1.0 | neutral 3; neutral 2 | 0.03 benign; 0.87 possibly damaging | Transport | OMIM: 605479; OMIM: 601847

AKAP4 | W07-1601 | NM_003886.2 | c.809T>C | p.Ile270Thr | - | E, D | 2.2 | neutral 0 | 1.00 probably damaging | Sperm motility; single fertilization; cell motility | -

BCORL1 | W07-1601 | NM_021946.4 | c.2459A>G | p.Asn820Ser | +/- | E, P, C | 3.1 | neutral 1 | 0.99 probably damaging | Regulation of transcription; chromatin modification | -

COL6A2 | W05-385 | NM_058174.2 | c.2679G>A | p.Met893Ile | - | E, D | 1.6 | neutral 6 | 0.44 possibly damaging | Cell-cell adhesion; extracellular matrix organization | OMIM: 158810; OMIM: 255600; OMIM: 254090

DAXX | W09-1095 | NM_001141969.1 | c.1457C>G | p.Ala486Gly | - | E, D | -0.1 | neutral 7 | 0.03 benign | DNA dependent regulation of transcription; apoptotic process; cytokinesis after mitosis | -

FAM58A | W07-1601 | NM_152274.3 | c.338G>A | p.Arg113His | - | E | 0.36 | neutral 9 | 0.06 benign | DNA dependent regulation of transcription; regulation of cyclin-dependent protein kinase activity | OMIM: 300707

MAN2A1 | W06-0984 | NM_002372.2 | c.3075_3077delCTC | p.Ser1026del | - | E, D | 1.0 | - | - | Mannose metabolic process; N-glycan processing | -

MAP7 | W06-0984 | NM_001198608.1 | c.1909G>A | p.Ala637Thr | - | E | 1.1 | neutral 6 | 0.01 benign | Microtubule cytoskeleton organization and biogenesis; establishment and/or maintenance of cell polarity |

MCM3AP | W05-385 | NM_003906.3 | c.2743G>A | p.Glu915Lys | +/- | E, P, C | 6.0 | neutral 1 | 0.92 probably damaging | DNA replication; protein import into nucleus | -

MLL3 | W08-0748 | NM_170606.2 | c.11020G>C; c.12655C>G | p.Gly3674Arg; p.Leu4219Val | - | E | 0.1; 0.59 | neutral 9; neutral 9 | 0.57 possibly damaging; 0.03 benign | Regulation of transcription; intracellular signal transduction |

MLL5 | W11-3472 | NM_182931.2 | c.4220A>G; c.5350C>T | p.Asn1407Ser; p.Pro1784Ser | - | E, C | 0.1; 3.8 | Protein not in UniProt release | 0.001 benign; 1.00 probably damaging | DNA methylation; chromatin modification |

PAMR1 | W05-385 | NM_015430.2 | c.1637G>A; c.1364G>A | p.Arg455Gln; p.Arg546Gln | - | E, C, D | 5.6; 3.4 | neutral 8; neutral 6 | 0.99 probably damaging; 0.29 benign | Proteolysis |

PTPRT | W09-1109 | NM_133170.3 | c.4094C>T; .arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1 | p.Thr1365Met | +/- | E, P, C, D | 3.5 | Protein not in UniProt release | 0.87 probably damaging | Transmembrane receptor protein tyrosine kinase signaling pathway |


Table 4 Continued

Gene | Family ID | Transcript | Mutation on cDNA level | Mutation on protein level | Mut class | Supporting evidence | PhyloP | SNPs&Go | PolyPhen-2 | Gene function | OMIM phenotype

SYNE1 | W10-1137 | NM_182961.2 | c.1964A>G; c.9262G>A; c.11675T>C | p.Gln655Arg; p.Ala3088Thr; p.Leu3892Ser | +/- | MIM612998, MIM610743, E, P, C, D | 4.8; 5.8; 4.8 | neutral 4; neutral 9; neutral 9 | 1.00 probably damaging; 0.99 probably damaging; 0.90 possibly damaging | Golgi organization; cell death; cytoskeletal & nuclear matrix anchoring at nuclear membrane; muscle cell differentiation | OMIM: 612998; OMIM: 610743

TDP2 | W09-1095 | NM_016614.2 | c.919T>C | p.Ile307Val | - | E, D | 2.8 | neutral 9 | 0.01 benign | Cellular surface receptor signaling pathway; double-strand break repair |

ZNF193 | W09-1095 | NM_001199479.1 | c.914A>G | p.His305Arg | - | E, P, D | 0.8 | neutral 5 | 0.98 probably damaging | Sequence-specific DNA binding transcription factor activity |

Summary of all segregating rare recessive missense variants in the sibling families with their classification of pathogenicity: (+/-) for potentially pathogenic and (-) for likely benign. Mutations were classified as potentially pathogenic if they fulfilled all the following criteria: (i) no disease phenotype (OMIM) for the respective gene, or a neurologic phenotype (OMIM) in which ID has not been reported as a feature so far (column supporting evidence: OMIM number is given whenever present), (ii) mRNA expression in brain/neuronal tissue according to the expressed sequence tags database (www.ncbi.nlm.nih.gov/dbEST/) (column supporting evidence: E), (iii) effect of mutation on protein level is predicted as disease causing/damaging by SNPs&Go²¹⁸ and/or PolyPhen-2²¹⁹ (column supporting evidence: P), and lastly, (iv) the altered amino acid is conserved (at least) amongst vertebrates (column supporting evidence: C). Mutations that did not fulfill these criteria were classified as likely benign. The column supporting evidence furthermore indicates whether the mutation(s) reside in a functional domain of the protein (indicated by D). For gene function, the GO terminology for biological processes is given. The column OMIM phenotype lists OMIM numbers of human disease without neurologic features. Abbreviation: XL-ID, X-linked intellectual disability.

Table 5 Homozygous deletions identified by exome sequencing

Family ID | Z-score | CN | Chr | Start position | End position | Size (bp) | Gene count | Gene name | Structural variation | Genomic qPCR | Segregation

W09-1109 | -5.3 | 0 | chr19 | 52146538 | 52149353 | 2,816 | 1 | SIGLEC14 | 13 | Homozygous | •

W09-2166 | -2.7 | 0 | chr7 | 100331757 | 100336279 | 4,523 | 1 | ZAN | 16 | Homozygous | •

W10-1134 | -4.4 | 0 | chr5 | 180375743 | 180431039 | 55,297 | 2 | BTNL3 | 14 | Homozygous | •

Exome copy number analysis was performed by use of cn.MOPS²¹⁵. Samples were analyzed with a reference set of 30 exomes for comparison. The threshold for detection of homozygous deletions was set at Z-score < -2.5. Validation of homozygous deletions was done by genomic qPCR as previously described by de Leeuw et al.²⁴³ Start and end positions are according to hg19. The structural variation column shows the number of reports in the Database of Genomic Variants²³³ that describe a deletion that encompasses our deleted region. • = deletion does not segregate with the disease in the family as determined by genomic qPCR. CN = predicted copy number by cn.MOPS.
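The Z-score threshold quoted for Table 5 can be illustrated with a plain standard-score calculation against a reference set. This is a simplified sketch under stated assumptions: cn.MOPS itself fits a mixture-of-Poissons model across samples, which is not reproduced here, and the function names and the use of a simple mean and standard deviation are illustrative choices. Only the cut-off of -2.5 is taken from the table legend.

```python
from statistics import mean, stdev
from typing import Sequence

Z_THRESHOLD = -2.5  # cut-off for homozygous deletions given in the Table 5 legend


def coverage_z_score(sample_coverage: float, reference_coverages: Sequence[float]) -> float:
    """Standard score of one sample's normalized coverage against reference exomes."""
    mu = mean(reference_coverages)
    sd = stdev(reference_coverages)  # sample standard deviation; needs >= 2 values
    if sd == 0:
        raise ValueError("reference coverages have zero variance")
    return (sample_coverage - mu) / sd


def is_candidate_homozygous_deletion(
    sample_coverage: float,
    reference_coverages: Sequence[float],
    threshold: float = Z_THRESHOLD,
) -> bool:
    """Flag a region whose coverage lies far below that of the reference set."""
    return coverage_z_score(sample_coverage, reference_coverages) < threshold
```

A region flagged in this way is only a candidate; as in the study, it would still need validation (here by genomic qPCR) and segregation testing in the family.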


Figure 3 Validation and segregation testing of homozygous deletions

[Panels A-C: exome coverage tracks with pedigrees. (A) Chromosome 19, family W09-1109, individual 3 versus controls, SIGLEC5/SIGLEC14 region. (B) Chromosome 7, family W09-2166, individual 2 versus control, ZAN. (C) Chromosome 5, family W10-1134, individual 2 versus control.]

Validation of homozygous deletions was done by genomic qPCR as previously described by de Leeuw et al.²⁴³ (A) Exome sequencing coverage for the proband of W09-1109 (arrow, individual 3) is shown as compared to two control samples and indicates a homozygous deletion of the last exons of SIGLEC14. The homozygous deletion is confirmed by genomic qPCR in the proband, but is absent in his affected siblings (individuals 1 and 2). (B) Exome sequencing coverage for the proband of W09-2166 (arrow, individual 2) is shown as compared to two control samples and indicates a homozygous deletion of the first exons of ZAN. The homozygous deletion is confirmed by genomic qPCR in the proband, but is present in heterozygous state in affected individual 1. (C) Exome sequencing coverage for the proband of W10-1134 (arrow, individual 2) is shown as compared to two control samples and indicates a homozygous deletion of the first exons of BTNL3. The homozygous deletion is confirmed by genomic qPCR in the proband, but does not segregate with the ID phenotype in individuals 1 and 3.

shared among several genes that have been implicated in clinically similar complex phenotypes with spasticity and ID, such as SPG11 (SPG11, OMIM 604360), ZFYVE26 (SPG15, OMIM 270700) and AP4B1 (SPG47, OMIM 607245)²²⁶⁻²²⁸. Pathogenicity of the DDHD2 mutations was supported by follow-up studies that identified recessive pathogenic mutations in three additional families with a comparable clinical presentation (reported in a separate study by Schuurs-Hoeijmakers et al., in press). These findings highlight the value of detailed clinical phenotyping for the selection of follow-up cohorts for variant interpretation.

Potentially pathogenic mutations in BCORL1, MCM3AP, PTPRT, SYNE1, TDP2 and ZNF582

In BCORL1 on the X-chromosome, a potentially pathogenic unique hemizygous missense variant, c.2459A>G, resulting in p.Asn820Ser, was identified in a brother pair (W07-1601) with severe ID, coarse face and hypotonia. The asparagine at position 820 is conserved among vertebrates. Of note, there were no unique autosomal variants that segregated in both boys. BCORL1 functions as a transcriptional corepressor and interacts with class II histone acetyltransferases and deacetylases (HDAC), HDAC4, HDAC5 and HDAC7, and also with the CtBP corepressor protein. Transcriptional repressors, such as BCOR (OMIM: 300166²²⁹), ARX (OMIM: 300419²³⁰), MECP2 (OMIM: 312750²³¹), EHMT1 (OMIM: 610253⁵⁴) and FOXG1 (OMIM: 613454²³²), have been implicated in ID syndromes.

A homozygous potentially pathogenic variant, c.2743G>A, was identified in MCM3AP in a brother and sister with borderline to mild ID, progressive polyneuropathy and ptosis (W05-385). Parental DNA to test heterozygosity of this variant in the parents was not available. The homozygous missense variant alters an amino acid that is conserved among vertebrates, p.Glu915Lys, and is predicted to be damaging to the protein structure by PolyPhen-2²¹⁹. MCM3AP encodes minichromosome maintenance 3 acetylating protein, which binds and acetylates MCM3 and thereby inhibits cell cycle progression.

A combination of a heterozygous missense variation, c.4094C>T, resulting in p.Thr1365Met, and a heterozygous intronic deletion of 150 kb (.arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1 mat) was identified within PTPRT in a family with three affected brothers and two affected sisters with a complex ID phenotype consisting of severe ID, behavioral problems, microcephaly, congenital cardiac defect and herniation of the abdominal diaphragm. PTPRT is expressed higher in


Figure 4 Study design and pathogenic mutations in known and novel genes for intellectual disability

[Panels A-D: pedigrees, raw sequence reads and Sanger chromatograms for families W09-1095 (TDP2, including mutation interpretation and gene expression profile), W10-1338 (DDHD2), W08-0135 (SLC9A6) and W10-2749 (SLC6A8).]

(A) Study design: (i) family selection, represented by the pedigree of family W09-1095, (ii) exome sequencing and variant selection resulting in candidate mutations, represented by the raw sequence reads of our candidate mutation, c.425+1G>A, in TDP2, (iii) validation and segregation testing by Sanger sequencing, depicted by the chromatograms showing the mutant sequence that was present in all affected male individuals and the wildtype sequence present in the unaffected sister, and (iv) variant interpretation (see legend Table 1 for further details). (B) Pedigree of family W10-1338, showing segregation of the compound heterozygous mutations, c.1804_1805insT and c.2057delA, in DDHD2. (C) Pedigree of family W08-0135, showing segregation of the hemizygous stop mutation, c.1639G>T, in SLC9A6. (D) Pedigree of family W10-2749, showing segregation of the hemizygous amino acid deletion, c.1005_1007delCAA, in SLC6A8. Arrows indicate the probands, M = mutation, M1 and M2 are the two different alleles of compound heterozygous mutations, ND = not determined, - = normal allele present.

the central nervous system than in other tissues, such as bone marrow, kidney, and skin. PTPRT encodes a transmembrane receptor of the protein tyrosine phosphatase family, which are important proteins in signal transduction. The p.Thr1365Met substitution resides in the second and last protein tyrosine phosphatase catalytic domain of the protein and affects a threonine that is conserved in vertebrates. The deletion removes at least half of intron 1, whereas deletions of >10 kb in intron 1 are not reported in healthy controls²³³.

In SYNE1 (OMIM: 610743, 612998), we identified three heterozygous missense variants in a Sicilian brother-sister pair (W10-1137) with mild ID, spastic paraplegia, axonal neuropathy and leukoencephalopathy in both, and a hypoplastic corpus callosum in the female. We confirmed the presence of three substitutions: c.1964A>G and c.9262G>A, resulting in p.Gln655Arg and p.Ala3088Thr, were both inherited from the father, and c.11675T>C, resulting in p.Leu3892Ser, was inherited from the mother. The paternal p.Gln655Arg and maternal p.Leu3892Ser substitutions are located in two of the multiple spectrin repeats of SYNE1, and the second paternal substitution, p.Ala3088Thr, lies in close proximity (14 amino acids) to such a repeat. SYNE1 is one of the largest human genes, consisting of 146 exons, with multiple transcripts and a markedly high expression in the human central nervous system²³⁴. It is part of the spectrin family of structural proteins that link the plasma membrane to the actin cytoskeleton. Studies in skeletal muscle indicate a role for SYNE1 in motor neuron innervation²³⁵. Nonsense mutations in SYNE1 have been described in autosomal recessive spinocerebellar ataxia type 8 (OMIM: 612998), and splice site and missense mutations (the latter all located in, or in close proximity to, spectrin repeats) have been linked to autosomal dominant Emery-Dreifuss muscular dystrophy type 4 (OMIM: 610743). A homozygous splice site mutation has been reported in one family with autosomal recessive arthrogryposis²³⁶. All three SYNE1 variants in family W10-1137 were predicted as probably damaging by PolyPhen-2²¹⁹ but predicted to be benign by SNPs&Go²¹⁸ (Table 4). Taken together, the complete conservation of these amino acids within vertebrates, their localization within functional domains of the protein, the reported function of SYNE1 in at least peripheral neurons, and its involvement in different neurodegenerative and neuromuscular phenotypes suggest that these mutations can be considered causal to at least a part of the phenotype.
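The parental origin of the three SYNE1 substitutions is what makes them fit a compound heterozygous model: at least one variant must sit on each parental allele. A minimal sketch of that check follows; the function names and the dictionary layout are hypothetical, and the example uses only the inheritance stated in the text (two paternal variants, one maternal variant).

```python
from typing import Dict, List


def alleles_by_parent(origins: Dict[str, str]) -> Dict[str, List[str]]:
    """Group variants by the parent they were inherited from."""
    grouped: Dict[str, List[str]] = {"father": [], "mother": []}
    for variant, parent in origins.items():
        if parent not in grouped:
            raise ValueError(f"unknown parental origin: {parent!r}")
        grouped[parent].append(variant)
    return grouped


def is_compound_heterozygous(origins: Dict[str, str]) -> bool:
    """True when both the paternal and the maternal allele carry a variant."""
    grouped = alleles_by_parent(origins)
    return bool(grouped["father"]) and bool(grouped["mother"])


# SYNE1 variants in family W10-1137, with parental origin as reported in the text.
syne1 = {
    "c.1964A>G": "father",
    "c.9262G>A": "father",
    "c.11675T>C": "mother",
}
```

With this input, `is_compound_heterozygous(syne1)` is true, whereas three variants that all came from the father would lie on a single allele and would not fit a recessive model.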


Figure 5 Flow diagram showing step-by-step variant classification of segregating recessive sequence variants

Rare variants (non-synonymous, splice site and protein truncating) that segregated with the ID phenotype were classified as pathogenic if they resided either in a known ID gene, or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype upon further study. Variants were labeled potentially pathogenic if they fulfilled four criteria: (i) the gene is already linked to a human neurologic phenotype other than ID, or is not linked to any human phenotype, as described in the Online Mendelian Inheritance in Man database (OMIM; www.ncbi.nlm.nih.gov/OMIM/), (ii) there is mRNA expression of the respective gene in brain/neuronal tissue according to the expressed sequence tags database (www.ncbi.nlm.nih.gov/dbEST/), (iii) the effect of the missense variant(s) is predicted to be 'disease causing/damaging' by SNPs&Go²¹⁸ and/or PolyPhen-2²¹⁹, and lastly, (iv) the altered amino acid is conserved amongst vertebrates. Variants that did not fulfill these four criteria were classified as likely benign.
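The decision flow described in this legend can be summarized as a short function. This is a sketch of the stated rules for segregating missense variants only, assuming each criterion has already been evaluated to a yes/no answer; the class, field names and string labels are hypothetical and do not represent software used in the study.

```python
from dataclasses import dataclass


@dataclass
class SegregatingVariant:
    """Hypothetical summary of the evidence for one segregating rare variant."""
    in_known_id_gene: bool          # gene already established as an ID gene
    second_family_found: bool       # second pathogenic mutation, similar phenotype
    omim_neurologic_or_none: bool   # criterion (i)
    brain_expression: bool          # criterion (ii), dbEST
    predicted_damaging: bool        # criterion (iii), SNPs&Go and/or PolyPhen-2
    conserved_in_vertebrates: bool  # criterion (iv)


def classify(variant: SegregatingVariant) -> str:
    """Return 'pathogenic', 'potentially pathogenic' or 'likely benign'."""
    if variant.in_known_id_gene or variant.second_family_found:
        return "pathogenic"
    four_criteria = (
        variant.omim_neurologic_or_none,
        variant.brain_expression,
        variant.predicted_damaging,
        variant.conserved_in_vertebrates,
    )
    return "potentially pathogenic" if all(four_criteria) else "likely benign"
```

Because the four criteria are combined with a logical AND, a single failed criterion, for example poor conservation, is enough to move a variant to the likely benign class.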


Figure 6 Segregation of potentially pathogenic missense variants

[Panels A-E: pedigrees with genotypes and Sanger chromatograms for MCM3AP, BCORL1 (W07-1601), PTPRT (including the chromosome 20 deletion plot), SYNE1 and ZNF582.]

(A) Pedigree of family W05-385 with Sanger confirmation of homozygous variant c.2743G>A (M) in MCM3AP. (B) Pedigree of family W07-1601 with Sanger confirmation of the hemizygous variant c.2459A>G (M) in BCORL1. (C) Pedigree of family W09-1109 with Sanger confirmation of variant c.4094C>T in PTPRT. Individuals in gray have borderline intellectual functioning. Bottom panel shows the 150 kb deletion on chromosome 20 in the proband, .arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1, detected with 250k SNP array testing. (D) Pedigree of family W10-1137 with Sanger confirmation of variants c.1964A>G (M1), c.9262G>A (M2) and c.11675T>C (M3) in SYNE1. (E) Pedigree of family W11-3472 with heterozygous variants c.193T>G (M1) and c.1034G>A (M2) in ZNF582.


Figure 7 Predicted effect of c.425+1G>A mutation on mRNA splicing

[Schematic of the wildtype transcript (exons 1 to 7) and three predicted aberrant splicing scenarios (A-C) caused by the G to A mutation, each ending in a premature stop codon.]

Scenario A will lead to exon 3 skipping, thereby introducing a premature stop codon, p.Tyr84*, on protein level; scenario B leads to intron 3 retention, resulting on protein level in a premature stop codon, p.Leu142fs*; and scenario C shows alternative splice donor site usage causing a shift of the reading frame, p.Gly135fs*16.

In TDP2, a homozygous canonical splice donor site mutation, c.425+1G>A, was detected in three affected brothers from an Irish family (W09-1095). This mutation resides in a 20 Mb homozygous stretch on chromosome 6, as found by 250k SNP analysis, that was shared by all affected individuals but not by their unaffected sister (data not shown). The mutation will likely result in aberrant mRNA splicing by (i) exon 3 skipping, thereby introducing a premature stop codon, p.Tyr84*, (ii) intron 3 retention, leading to a premature stop codon, p.Leu142fs*, or (iii) alternative splice donor site usage resulting in a shift of the reading frame, p.Gly135fs*16 (Figure 7). All three men in this family presented with moderate ID, epilepsy, facial dysmorphisms and mild dental anomalies. Besides this splice donor site mutation, three homozygous rare missense variants were identified in TDP2, ZNF193 and DAXX in the same homozygous stretch on chromosome 6, and these consequently segregated with the disease. All three missense variants are likely to be benign variations: they have low amino acid conservation among species and are predicted not to affect protein function by both SNPs&Go²¹⁸,²³⁷ and PolyPhen-2²¹⁹ (Table 4). TDP2, a member of the Mg2+/Mn2+-dependent family of phosphodiesterases²³⁸, is an enzyme that cleaves 5'-phosphotyrosyl bonds and is involved in DNA repair of 5' double-strand breaks. Germline mutations in other DNA repair genes have been implicated in neurodevelopmental and neurodegenerative phenotypes (ERCC8, OMIM: 216400; ERCC6, OMIM: 133540; ERCC2, ERCC3, GTF2H5, OMIM: 601675). Furthermore, TDP2 is thought to be involved in transcriptional regulation and signal transduction²³⁹. The nature of this mutation in combination with the function of TDP2 makes it a strong candidate for the observed phenotype in this family.

Lastly, in another transcriptional repressor, ZNF582, compound heterozygous missense variants c.193T>G and c.1034G>A, resulting in p.Trp65Gly and p.Gly345Glu, were identified in two sisters with mild ID and an eye movement disorder (W11-3472). ZNF582 is located within a zinc-finger cluster on chromosome 19q13.43 and belongs to the family of Kruppel-associated box (KRAB)-containing zinc finger proteins. ZNF582 harbors nine zinc finger domains that are essential for DNA binding. The p.Gly345Glu substitution is positioned next to the second cysteine of the C2H2 zinc ion binding motif of the sixth zinc finger domain. The p.Trp65Gly substitution is located in a conserved amino acid sequence of the Kruppel-associated box. Several members of the large KRAB-containing zinc finger protein family have been implicated in X-linked ID (ZNF41, OMIM: 300848²⁴⁰; ZNF81, OMIM: 300498²⁴¹; and ZNF674, OMIM: 300851²⁴²). For both transcriptional repressors, BCORL1 and ZNF582, the combination of the mutation characteristics, the gene function, and gene family members that have already been implicated in ID supports their possible involvement in ID pathology.

Families without candidate mutations

In eleven sibling families we found either no segregating variants or segregating variants in genes that are unlikely candidates to explain the ID phenotype (Table 4). This might originate from technical limitations: the mutations may be located outside the regions captured in our exome experiment, or they may be captured but poorly sequenced and/or difficult to map. In some of the families the inheritance model might also differ from the one we anticipated: one of the parents could carry a (germline) mosaic mutation, or there could be a more complex (digenic or multigenic) inheritance pattern.


Discussion

In 20 sibling families, we identified pathogenic or potentially pathogenic mutations in one novel, two known and six candidate ID genes. Variants that could convincingly be classified as pathogenic were found in three of the twenty families, a diagnostic yield of 15% in this relatively small sample. Two of these variants were identified in X-chromosomal ID genes in two of the four affected brother-pair families, and mutations in a third, autosomal gene, DDHD2, were identified in an affected brother-sister pair. This is to be expected, as X-chromosomal mutations contribute considerably to disease in families with only affected male siblings105. Interpretation of variants on the X chromosome is relatively straightforward because, in contrast to ARID genes, the majority of X-chromosomal ID genes have already been discovered. The diagnostic yield in brother-sister and sister-sister pairs will presumably improve as more autosomal recessive ID genes are uncovered. Potentially pathogenic mutations were found in six genes that have not previously been linked to ID. Further clinical and molecular studies are necessary to provide conclusive evidence for the involvement of these genes in ID. Given the heterogeneous nature of ID, there is a need for systematic worldwide storage of combined clinical and exome data to make interpretation of these candidate mutations feasible. In our study, on average 13 potentially recessive rare variants per sample needed follow-up by Sanger sequencing after variant selection, as well as four potentially hemizygous rare variants on the X chromosome in the brother-pair families. Segregation testing in combination with variant interpretation and subsequent variant classification reduced this number to zero to one candidate ID gene per family. These results are comparable to those of studies that use family-based exome sequencing in isolated ID cases (e.g. three exomes per family, for the patient and both parents) to identify zero to two de novo candidate mutations115,134, and to those of studies that combine homozygosity mapping with targeted, but not genome-wide, massively parallel sequencing in large consanguineous families to identify homozygous candidate mutations110. With the current strategy, however, which uses an unbiased, genome-wide approach, only one exome per family needed to be studied. This indicates that our sibling approach, focusing on rare recessive sequence variants in the coding sequence of the human genome, results in a manageable number of candidate mutations for follow-up studies, such as collaborative screening of large cohorts of patients with ID for additional mutations, (cell biological) assays proving impairment of protein function and/or animal studies that support a role for these candidate genes in intellectual functioning. In conclusion, we identified potentially pathogenic mutations in nine out of 20 families (45%). These results demonstrate that recessive mutations can readily be identified


in small sibling families, which not only opens up this previously less accessible group of ID patients to the identification of novel autosomal recessive and X-linked ID genes, but also offers their families, who face a high recurrence risk of 25%, the possibility of obtaining a molecular diagnosis, genetic counseling and prenatal diagnosis.
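The 25% recurrence risk quoted above follows from simple Mendelian enumeration. The sketch below is illustrative only: it assumes two carrier parents for the autosomal recessive case and a carrier mother with an unaffected father for the X-linked case, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def autosomal_recessive_risk():
    """Risk of an affected child when both parents are heterozygous carriers."""
    # Each parent transmits either the normal allele "A" or the mutant allele "a".
    offspring = list(product("Aa", "Aa"))
    affected = [child for child in offspring if child == ("a", "a")]
    return Fraction(len(affected), len(offspring))


def x_linked_recessive_risk():
    """Risk of an affected child for a carrier mother and an unaffected father."""
    # Mother transmits "X" (normal) or "x" (mutant); father transmits "X" or "Y".
    offspring = list(product("Xx", "XY"))
    affected = [child for child in offspring if child == ("x", "Y")]
    return Fraction(len(affected), len(offspring))


print(autosomal_recessive_risk(), x_linked_recessive_risk())
```

Under these assumptions both inheritance models give the same overall risk per pregnancy, although in the X-linked case only sons are affected.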

Acknowledgements

We are grateful to the patients and their families for their support and cooperation. We thank Dr. N. de Leeuw for her assistance with whole genome copy number and genotyping analysis, Dr. G. Salomons for SLC6A8 mutation analysis and Z. Iqbal for discussion and support. This work was funded in part by grants from the Dutch Organization for Health Research and Development (917-86-319 to BBAdV, 911-08-0205 to JAV), the EU-funded GENCODYS project (EU-7th-2010-241995 to HvB and BBAdV), the Dutch Brain Foundation (2010(1)-30 to APMdB, 2009(1)-22 to BBAdV), the Italian Ministry of Health (Ricerca Corrente 2012 entitled "Le malattie genetiche con ritardo mentale" to CR, PF, DG, MF, OG) and "5 per mille" funding to CR, PF, DG, MF, OG.


Mutations in DDHD2, encoding an intracellular phospholipase, cause a recessive form of complex Hereditary Spastic Paraplegia

(American Journal of Human Genetics, in press)

Janneke H.M. Schuurs-Hoeijmakers, Michael T. Geraghty, Erik-Jan Kamsteeg, Salma Ben-Salem, Susanne T. de Bot, Bonnie Nijhof, Ilse I.G.M. van de Vondervoort, Marinette van der Graaf, Anna Castells Nobau, Irene Otte-Höller, Sascha Vermeer, Amanda C. Smith, Peter Humphreys, Jeremy Schwartzentruber, FORGE Canada Consortium, Bassam R. Ali, Saeed A. Al-Yahyaee, Said Tariq, Thachillath Pramathan, Riad Bayoumi, Hubertus P.H. Kremer, Bart P. van de Warrenburg, Willem M.R. van den Akker, Christian Gilissen, Joris A. Veltman, Irene M. Janssen, Anneke T. Vulto-van Silfhout, Saskia van der Velde-Visser, Dirk J. Lefeber, Adinda Diekstra, Corrie E. Erasmus, Michèl A. Willemsen, Lisenka E.L.M. Vissers, Martin Lammens, Hans van Bokhoven, Han G. Brunner, Ron A. Wevers, Annette Schenck, Lihadh Al-Gazali, Bert B.A. de Vries, Arjan P.M. de Brouwer

The first three authors contributed equally; the last three authors contributed equally.

1 Department of Human Genetics (855), Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands. 2 Institute for Genetic and Metabolic Disease, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 3 Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 4 Department of Pediatrics, Children's Hospital of Eastern Ontario, University of Ottawa, K1H 8L1 Ontario, Canada. 5 Departments of Pathology and Paediatrics, Faculty of Medicine and Health Sciences, United Arab Emirates University, PO Box 17666, Al-Ain, United Arab Emirates. 6 Department of Neurology, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 7 Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands. 8 Departments of Pediatrics and Radiology, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 9 Department of Pathology, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 10 OHRI, University of Ottawa, K1H 8L1 Ontario, Canada. 11 McGill University and Genome Quebec Innovation Centre, H3A1A4 Montreal, Quebec, Canada. 12 FORGE Steering Committee membership is listed in Acknowledgements. 13 Department of Genetics, College of Medicine and Health Sciences, Sultan Qaboos University, PO Box 35, Al Khod, Muscat 123, Oman. 14 Departments of Anatomy, Faculty of Medicine and Health Sciences, United Arab Emirates University, PO Box 17666, Al-Ain, United Arab Emirates. 15 Department of Biochemistry, College of Medicine and Health Sciences, Sultan Qaboos University, PO Box 35, Al Khod, Muscat 123, Oman. 16 Department of Neurology, University of Groningen, University Medical Center Groningen, 9700 RB Groningen, The Netherlands. 17 Department of Laboratory Medicine, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 18 Department of Pediatric Neurology, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands. 19 Laboratory of Genetic, Endocrine and Metabolic Diseases, Radboud University Nijmegen Medical Centre, 6500 HB Nijmegen, The Netherlands.

CHAPTER 4

Abstract

We report on four families with a clinical presentation of complex hereditary spastic paraplegia (HSP) due to mutations in DDHD2, one of the three mammalian intracellular phospholipases A1 (iPLA1). The core phenotype of this new HSP syndrome consists of very early onset (<2 years) spastic paraplegia, intellectual disability (ID), and a specific pattern of brain abnormalities on cerebral imaging. An essential role for DDHD2 in the human central nervous system, and perhaps more specifically in synaptic functioning, is supported by a reduced number of active zones at synaptic terminals in Ddhd knockdown Drosophila models. All identified mutations affect the DDHD domain of the protein, which is vital for its phospholipase activity. In line with the function of DDHD2 in lipid metabolism and its role in the central nervous system, an abnormal lipid peak indicating accumulation of lipids was detected with cerebral MR spectroscopy, which provides an applicable diagnostic biomarker that can distinguish the DDHD2 phenotype from other complex HSP phenotypes. We show that mutations in DDHD2 cause a specific complex HSP subtype (SPG54), thereby linking a member of the PLA1 family to human neurologic disease.

80 MUTATIONS IN DDHD2 CAUSE COMPLEX HSP

Hereditary spastic paraplegias (HSP) are a genetically heterogeneous group of neurodegenerative disorders that share a length-dependent, distal axonopathy of the fibers of the corticospinal tract, which clinically leads to lower limb spasticity and weakness. In pure HSP, symptoms are more or less restricted to corticospinal tract dysfunction, whereas in complex HSP the disease is more widespread and encompasses additional neurologic and non-neurologic symptoms. Inheritance can follow an X-linked, autosomal dominant or autosomal recessive pattern. In 1986 the first HSP locus was described244, and to date over 50 loci have been mapped (SPG1-53). In these loci, 29 causative genes have been identified (Schüle et al.245 and OMIM, July 2012). The proteins involved have diverse functions, of which the most prominent are regulation of intracellular trafficking, organelle biogenesis and shaping of membranes226. For autosomal recessive HSP in combination with a thin corpus callosum and white matter abnormalities, five genes have been identified: SPG11 (~20% of autosomal recessive HSP, [MIM 604360])245, ZFYVE26 (SPG15, <3% of autosomal recessive HSP, [MIM 270700])245, SPG21 (Old Order Amish population, [MIM 248900])246, GJC2 (SPG44, one family reported, [MIM 613206])247 and AP4B1 (SPG47, two families reported, [MIM 607245])136,248. We ascertained two families with a complex form of HSP consisting of a combination of early onset spastic paraplegia and intellectual disability (ID), characterized by a markedly thin corpus callosum and subtle periventricular white matter hyperintensities on MRI, and an abnormal lipid peak on cerebral proton MR spectroscopy. Exome sequencing was performed on genomic DNA of the affected boy of family 1 and on both siblings and the father from family 2 (Figure 1). For families 1 and 2, genomic DNA was captured with an Agilent SureSelect Human All Exon 50 Mb Kit (Agilent Technologies, Santa Clara, CA, USA). 
Captured DNA of family 1 was sequenced on one-fourth of a sequence slide on a SOLiD™ 4 System (Life Technologies, Carlsbad, CA, USA) and aligned to the reference human genome (hg19). The exome sequencing experiment and analysis pipeline have been described previously by Vissers et al.115. For family 2, captured DNA was sequenced on an Illumina HiSeq 2000 according to the manufacturer's protocol (Illumina Inc., San Diego, CA, USA). The FASTX-Toolkit was used to remove adaptor sequences and to trim sequence reads. Burrows-Wheeler Aligner (version 0.5.9)249 and the Genome Analysis Toolkit250 were used to align reads to the reference human genome (hg19). Single nucleotide variants and indels were called using SAMtools251 and annotated using ANNOVAR252 and custom scripts. For both families, variant prioritization (including indel variations) excluded all variants present in less than 20% of reads, as well as all non-genic and intronic variants (other than those at canonical splice sites) and variants leading to synonymous amino acid changes. Next, variants with a frequency of >1% in dbSNPv134 or the Nijmegen local variant database (~270 exome sequencing experiments) were excluded for family 1. For family 2, variants present in dbSNPv132


Figure 1 Pedigrees of families 1 to 4 with photographs of the face and chromatograms showing Sanger confirmation of the mutations in DDHD2 (NM_015214.2)

[Panels A-D: pedigrees and Sanger chromatograms. Family 1: M1 = p.Thr602Ilefs*18, M2 = p.Glu686Glyfs*35. Family 2: M1 = p.Ile463Hisfs*6, M2 = p.Asp660His. Family 3: M = p.Arg516*. Family 4: M = p.Arg287*.]


Figure 1 Continued

[Panel E: schematic of the DDHD2 protein from N- to C-terminus, showing the WWE domain, lipase domain, SAM domain and DDHD domain, with the positions of p.Arg287*, p.Ile463Hisfs*6, p.Arg516*, p.Thr602Ilefs*18, p.Asp660His and p.Glu686Glyfs*35.]

Arrows indicate the individuals on whom exome sequencing was performed. M = mutant allele, - = wild-type allele, C = control. (A) Family 1 (family identifier: W10-1338), with an affected sister and brother and compound heterozygous frameshift mutations, c.1804_1805insT (p.Thr602Ilefs*18) and c.2057delA (p.Glu686Glyfs*35). (B) Family 2, with an affected sister and brother and compound heterozygous frameshift and missense mutations, c.1386dupC (p.Ile463Hisfs*6) and c.1978G>C (p.Asp660His). (C) Consanguineous family 3, with seven affected individuals and a homozygous mutation, c.1546C>T (p.Arg516*). Pedigree numbering is according to the original pedigree by Al-Yahyaee et al.253. (D) Consanguineous family 4 (family identifier: W12-0041), with one affected male individual and a homozygous mutation, c.859C>T (p.Arg287*). (E) The protein structure of DDHD2, including its four domains (WWE, lipase, SAM and DDHD domain), with the positions of all identified mutations indicated.

or control samples from exome experiments sequenced through the FORGE project (~400 exome sequencing experiments) were excluded. Candidate mutations were selected under the assumption of an autosomal recessive disease model (Tables 1 and 2). Between the two families, candidate mutations overlapped in one gene only, DDHD2 (NM_015214.2). For family 1, variant selection resulted in candidate recessive mutations in seven genes, of which the most striking were two heterozygous mutations in DDHD2, as both result in a shift of the open reading frame. In family 2, compound heterozygous mutations in a single gene, DDHD2, fitted a recessive model after comparison of the variants of all three family members (Figure 2). Sanger sequencing showed that the respective mutations segregated with the phenotype in both families (Figure 1A,B). In family 1, compound heterozygous frameshift mutations c.1804_1805insT and c.2057delA were detected in DDHD2 (Figure 1A). 
Both mutations result at the protein level in a premature termination codon (PTC), p.Thr602Ilefs*18 and p.Glu686Glyfs*35, within the DDHD domain, which is located at the C-terminal end of the protein (Figure 1E). In family 2, a compound heterozygous frameshift mutation, c.1386dupC, and a missense mutation, c.1978G>C, were identified (Figure 1B,E). The frameshift mutation introduces a PTC before the DDHD domain at the protein level (p.Ile463Hisfs*6). The missense mutation results at the protein level in the substitution of a histidine for an aspartic acid within the RIDYXL motif (p.Asp660His), which is conserved among the DDHD domains of the three human PLA1s: DDHD1 ([MIM 614603]), DDHD2 and SEC23IP (Figure 3A). In addition, the aspartic acid at position 660 is conserved down to Drosophila melanogaster (Figure 3B). The


Table 1 Raw sequencing statistics & prioritization of variants of family 1

Individual                                                Family 1: II-2
Total number of sequenced reads (x10^6)                   189.24
Total number of mapped reads (x10^6)                      144.51
Total number of bases mapped (Gb)                         6.7
Total bases mapping to targets (Gb)                       5.8
% targets with 10x coverage                               88.9%
Mean target coverage (fold)                               99.9
Median target coverage (fold)                             74
All variants                                              30,182
QC filtering (a)                                          29,117
After exclusion of known variants                         568
Affecting protein sequence or canonical splice sites      180
Of which fit a recessive disease model (b)                37
Of which fit a recessive disease model after manual
  inspection of raw sequence reads                        11 (c)
Gene(s) in overlap with family 2                          1 (DDHD2)

a: >2 unique variation reads and >20% variation reads. b: >1 variation in a gene for compound heterozygous variants and >80% variation reads for candidate homozygous variants. c: Four candidate compound heterozygous mutations and three candidate homozygous mutations. The candidate mutations in DDHD2 were the only recessive protein-truncating mutations.
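The filtering funnel summarized in Table 1 can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the thresholds come from the table footnotes and the family 1 frequency cut-off, while the `Variant` record, the effect labels and the placeholder genes GENE_A to GENE_E are hypothetical. The DDHD2 read counts are those reported in the Figure 2 legend.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    variant_reads: int      # unique reads supporting the variant
    total_reads: int        # all reads covering the position
    population_freq: float  # frequency in reference databases (0.0 if absent)
    effect: str             # hypothetical annotation label


# Hypothetical set of annotations treated as protein-altering or canonical splice site.
PROTEIN_ALTERING = {"frameshift", "nonsense", "missense", "splice_site"}


def variant_fraction(v):
    return v.variant_reads / v.total_reads


def passes_qc(v):
    # Footnote a: >2 unique variation reads and >20% variation reads.
    return v.variant_reads > 2 and variant_fraction(v) > 0.20


def recessive_candidates(variants, max_freq=0.01):
    """Return genes that fit a recessive model, keyed by gene name."""
    kept = [
        v for v in variants
        if passes_qc(v) and v.population_freq <= max_freq and v.effect in PROTEIN_ALTERING
    ]
    by_gene = defaultdict(list)
    for v in kept:
        by_gene[v.gene].append(v)

    result = {}
    for gene, gene_variants in by_gene.items():
        # Footnote b: >80% variation reads for candidate homozygous variants,
        # >1 variation in a gene for candidate compound heterozygous variants.
        if any(variant_fraction(v) > 0.80 for v in gene_variants):
            result[gene] = "homozygous"
        elif len(gene_variants) > 1:
            result[gene] = "compound_heterozygous"
    return result


example = [
    Variant("DDHD2", 19, 71, 0.0, "frameshift"),    # c.2057delA, 19 of 71 reads
    Variant("DDHD2", 36, 66, 0.0, "frameshift"),    # c.1804_1805insT, 36 of 66 reads
    Variant("GENE_A", 30, 60, 0.0, "missense"),     # single heterozygous variant
    Variant("GENE_B", 58, 60, 0.05, "missense"),    # too common
    Variant("GENE_C", 55, 60, 0.0, "synonymous"),   # does not alter the protein
    Variant("GENE_D", 2, 40, 0.0, "nonsense"),      # fails QC
    Variant("GENE_E", 57, 60, 0.0, "missense"),     # candidate homozygous
]
print(recessive_candidates(example))
```

Candidates that survive such a filter still require manual inspection of the reads and segregation testing, as in the last rows of the table.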

c.1978G>C mutation was found once in the Exome Variant Server database, consisting of 13,006 alleles, but not in our local SNP database, consisting of 2,302 alleles. The other mutations were not present in either database. DDHD2 is located on chromosome 8p11.23. Previously, in a large consanguineous family of Omani origin with a similar phenotype (family 3), the genetic defect was mapped to a 9 cM interval on chromosome 8p by Al-Yahyaee et al.253. This region encompasses DDHD2 as well as ERLIN2 (SPG18, [MIM 611225])253,255, but mutation analysis and mRNA expression analysis excluded ERLIN2 as the causative gene, whereas Sanger sequencing of DDHD2 revealed a homozygous c.1546C>T mutation. This mutation introduces a PTC, resulting in p.Arg516*, at the very beginning of the DDHD domain (Figure 1C). A screen of all affected and unaffected individuals showed complete segregation of this mutation with the disease phenotype in this family. The p.Arg516* change was absent from 200 alleles of healthy, ethnically matched controls and was observed once in heterozygous state in the Exome Variant Server database, but was absent from our local SNP database.


Table 2 Raw sequencing statistics & prioritization of variants of family 2

Family 2 individual                                       II-2        II-1        I-1
Total number of sequenced reads (x10^6)                   152.41      122.98      120.02
Total number of mapped reads (x10^6)                      152.40      122.98      120.02
Total number of bases mapped (Gb)                         14.6        11.8        11.5
% of CCDS bases with 10x coverage                         92.7        92.2        92.2
Mean target coverage (fold) (a)                           109.5       93.3        94.3
Median target coverage (fold) (a)                         90          77          78
All variants                                              154,566     143,418     142,473
QC filtering (a)                                          110,606     104,530     103,559
After exclusion of known variants                         728         671         860
Affecting protein sequence or canonical splice sites      119         142         185
Of which fit a recessive disease model                    13          15          NA
Of which fit a recessive disease model after
  comparison of brother, sister and father                1 (DDHD2)
Gene(s) in overlap with family 1                          1 (DDHD2)

a: After duplicate read removal. NA = not analyzed.
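The last filtering step in Table 2, comparing the two affected siblings with their father, can be illustrated as follows. This is a sketch under stated assumptions rather than the method actually applied: it treats a gene as fitting a compound heterozygous model when all affected siblings share at least two heterozygous variants and the father carries some, but not all, of them, so that the variants plausibly lie on different parental alleles. The function name is hypothetical.

```python
def fits_compound_het(sibling_variants, father_variants):
    """Check one gene for a compound heterozygous pattern in a sibling-father trio.

    sibling_variants: list of sets, one per affected sibling, holding the
        heterozygous variants that sibling carries in the gene.
    father_variants: set of variants the father carries in the same gene.
    """
    shared = set.intersection(*sibling_variants)
    if len(shared) < 2:
        return False
    paternal = shared & father_variants
    # Assumption: at least one shared variant is paternal and at least one is not,
    # which is compatible with the two variants being in trans.
    return 0 < len(paternal) < len(shared)


# Family 2 as described in the Figure 2 legend: both siblings carry both variants,
# the father carries the insertion but not the substitution.
siblings = [{"c.1386dupC", "c.1978G>C"}, {"c.1386dupC", "c.1978G>C"}]
father = {"c.1386dupC"}
print(fits_compound_het(siblings, father))
```

Without maternal DNA this check cannot distinguish maternal inheritance of the second variant from a de novo event, so Sanger confirmation of segregation remains necessary.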

In order to further determine the occurrence and phenotypic spectrum of DDHD2 mutations, we sequenced the protein-coding sequence of DDHD2 in 55 additional individuals with complex HSP in whom mutations in SPG11 had been excluded and who had a family history suggestive of recessive inheritance. This resulted in the identification of one further homozygous mutation in an affected male individual from a consanguineous Iranian family. This homozygous C to T substitution in exon 9 at c.859 introduces a PTC at amino acid position 287 of the protein (family 4, Figure 1D). Review of the phenotypic features of all four families shows psychomotor delay that develops into intellectual disability, and very early onset, progressive spasticity (Table 3). The first symptoms of spasticity became apparent before the age of two years. Foot contractures developed due to progressive and pronounced spasticity. In addition, strabismus and dysarthria were present in nine out of 12 individuals, frequently accompanied by dysphagia. Of note, optic nerve hypoplasia was present in three out of five individuals who had undergone


Figure 2 DDHD2 exome experiment

[Read alignment panels. (A) Family 1, II-2. (B) Family 2, II-2: chr8:38109471, p.Ile463Hisfs*6 (left) and chr8:38111160, p.Asp660His (right).]

Raw sequencing reads of mutations in DDHD2 identified by exome sequencing and visualized with the Integrative Genomics Viewer (http://www.broadinstitute.org/igv/home). (A) Individual II-2 of family 1. Left panel: 19 of 71 reads (26%) showed a deletion of A at chr8(hg19):g.38117560; right panel: 36 of 66 reads (55%) showed an insertion of T at chr8(hg19):g.38110558-38110559. (B) Both affected siblings (II-1 and II-2) of family 2 show a heterozygous insertion of C at chr8(hg19):g.38109471 (left panel) and a heterozygous base pair substitution G to C at chr8(hg19):g.38111160 (right panel). The father (I-1) shows the heterozygous insertion (left panel), but not the substitution (right panel).

thorough ophthalmologic evaluation. Cerebral imaging gave a consistent pattern of brain abnormalities, composed of a markedly thin corpus callosum combined with subtle periventricular white matter hyperintensities (Figures 4-7). In five affected individuals of families 1, 2 and 4, cerebral proton MR spectroscopy was performed; this revealed an abnormal spectrum with a lipid peak (1.3 ppm) showing


Figure 3 DDHD domain conservation

[Sequence alignment panels. (A) DDHD2, DDHD1 and SEC23IP. (B) Homo sapiens, Mus musculus, Danio rerio and D. melanogaster.]

(A) Alignment of the DDHD domain is shown for all three mammalian iPLA1s: DDHD2 (NP_056029.2), DDHD1 (NP_001153620.1) and SEC23IP (NP_009121.1). (B) Cross-species alignment of part of the human DDHD domain for human (Homo sapiens; ENSP00000380352), mouse (Mus musculus; ENSMUSP00000096459), zebrafish (Danio rerio; ENSDARP00000103209) and fruit fly (Drosophila melanogaster; FBpp0292462). The box indicates the conserved RIDYXL sequence motif. The position of Asp660, which is mutated in family 2, is highlighted in black. The box with the dotted line indicates the last 14 amino acids, which are altered by the p.Glu686Glyfs*35 mutation of family 1.

highest intensity around the basal ganglia/thalamus area (Figure 4-6). This peak is similar to the characteristic, abnormal lipid peak seen in Sjögren-Larsson syndrome256 and is indicative of abnormal brain lipid accumulation. Furthermore, in two male individuals from families 1 and 2, a syrinx was observed on spinal MR imaging (Figure 8). Four of the six mutations identified are predicted to introduce a premature termination codon into the DDHD2 mRNA open reading frame. Three of these are positioned more than 55 nucleotides before the last exon-exon boundary and hence might give rise to nonsense-mediated RNA decay (NMD). To test this prediction, we measured DDHD2 mRNA levels in Epstein-Barr-virus transformed lymphoblastoid cell lines (EBV-LCLs) of the affected individuals of families 1 and 4 and in a cultured fibroblast cell line from individual IV-10 in family 3 by real-time


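Table 3 Phenotype of individuals with DDHD2 mutations

Affected individuals: family 1: II-1, II-2; family 2: II-1, II-2; family 3 (a): IV-3, IV-4, IV-10, IV-11, IV-12, IV-13, V-30; family 4: II-1. Values are listed per family in this order.

Descent: family 1: Dutch-Philippine; family 2: Canadian; family 3: Oman; family 4: Iran
Consanguinity: family 3: +; family 4: +
Mutation (cDNA): family 1: c.1804_1805insT; c.2057delA; family 2: c.1386dupC; c.1978G>C; family 3: c.1546C>T; family 4: c.859C>T
Alteration (protein): family 1: p.Thr602Ilefs*18; p.Glu686Glyfs*35; family 2: p.Ile463Hisfs*6; p.Asp660His; family 3: p.Arg516*; family 4: p.Arg287*
Gender: family 1: F, M; family 2: F, M; family 3: F, M, F, F, F, M, F; family 4: M
Age at investigation (yrs): family 1: 5, 3; family 2: 10, 7; family 3: 9, 10, 21, 15, 11, 10, 8; family 4: 30

Clinical features
ID and/or DD: family 1: +, +; family 2: +, +; family 3: +, +, +, +, +, +, +; family 4: +; total 12/12
Hypomimia: families 1 and 2: + in two individuals; family 3: -, -, -, -, -, -, -; family 4: +; total 3/12
Strabismus: family 1: -, -; family 2: +, +; family 3: +, +, +, +, +, +, NA; family 4: +; total 9/12
Optic nerve hypoplasia: families 1 and 2: + in two individuals; family 3: NA, NA, NA, NA, NA, NA, NA; family 4: +; total 3/5
Dysarthria: families 1 and 2: + in two individuals; family 3: +, +, +, +, +, +, NA; family 4: +; total 9/12
Dysphagia: families 1 and 2: + in one individual; family 3: -, -, +, +, +, +, +; family 4: -; total 6/12
Constipation: family 1: +, +; family 2: +, +; family 3: +, -, +, +, -, -, -; family 4: -; total 7/12
Urinary incontinence: family 1: +, +; family 2: +, +; family 3: -, -, -, -, -, -, -; family 4: -; total 4/12
Fecal incontinence: families 1 and 2: + in three individuals; family 3: -, -, -, -, -, -, -; family 4: -; total 3/12

Upper limbs
Spasticity: family 1: -, -; family 2: Moderate, Moderate; family 3: Mild, Mild, -, Mild, -, -, -; family 4: -; total 5/12
Distal weakness: families 1 and 2: + in two individuals; family 3: -, -, -, -, -, -, -; family 4: +; total 3/12
Rigidity: family 1: -, -; family 2: -, +; family 3: -, -, -, -, -, -, -; family 4: +; total 2/12

Lower limbs
Spastic paraplegia: family 1: +, +; family 2: +, +; family 3: +, +, +, +, +, +, +; family 4: +; total 12/12
Hyperreflexia: family 1: +, +; family 2: +, +; family 3: +, +, +, +, +, +, +; family 4: +; total 12/12
Distal weakness: family 1: -, -; family 2: +, +; family 3: +, +, +, +, +, +, NA; family 4: +; total 9/12
Pes cavus: family 1: -, -; family 2: -, -; family 3: -, -, +, +, -, -, -; family 4: -; total 2/12
Foot contractures: family 1: +, +; family 2: +, +; family 3: +, +, +, +, +, +, +; family 4: +; total 12/12

Radiological findings
Thin corpus callosum: family 1: +, +; family 2: +, +; family 3: +, +, NA, +, +, +, +; family 4: +; total 11/11
PWMH: family 1: +, +; family 2: +, +; family 3: +, +, NA, +, +, +, +; family 4: +; total 11/11
Lipid peak (b): family 1: +, +; family 2: +, +; family 3: NA, NA, NA, NA, NA, NA, NA; family 4: +; total 5/5
Syrinx: family 1: -, +; family 2: NA, +; family 3: NA, NA, NA, NA, NA, NA, NA; family 4: -; total 2/4
Mitochondrial function: family 1: Normal, NA; family 2: Normal, Normal; family 3: NA, NA, NA, NA, NA, NA, NA; family 4: NA; total 0/3

RefSeq accession number NM_015214.2 was used in naming mutations. a: Family 3 has previously been described by Al-Yahyaee et al.253. b: Lipid peak at 1.3 ppm (as measured by proton MR spectroscopy), with highest signal intensity in the basal ganglia/thalamus area. + indicates the presence and - the absence of clinical features. Abbreviations: ID = intellectual disability, DD = developmental delay, PWMH = periventricular white matter hyperintensities, NA = not available.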


Figure 4 Cerebral imaging of family 1

Cerebral imaging of family 1. Upper panels: Mid-sagittal T1-weighted MRI of the brain showing a markedly thin corpus callosum in both affected siblings (white arrow). Middle panels: Transverse T2-weighted MRI of the brain showing subtle white matter hyperintensities in the same individuals. Bottom panels: Proton MR spectra obtained at a magnetic field of 1.5 Tesla. The voxel was placed just cranial to the basal ganglia/thalamus area, from which a proton MR spectrum at long echo time (144 ms) was obtained, showing the prominent pathologic lipid peak at 1.3 ppm in addition to the common spectral peaks of choline (Cho), creatine (Cr) and N-acetylaspartate (NAA).


Figure 5 Cerebral imaging of family 2


Cerebral imaging of family 2. Upper panels: Mid-sagittal T1-weighted MRI of the brain showing a markedly thin corpus callosum in both affected siblings (white arrow). Middle panels: Transverse T2-FLAIR-weighted MRI of the brain showing subtle white matter hyperintensities in the same individuals. Bottom panels: Proton MR spectra obtained at a magnetic field of 1.5 Tesla. The voxel was placed at the basal ganglia area, from which a proton MR spectrum at short echo time (35 ms) was obtained, showing the prominent pathologic lipid peak at 1.3 ppm in addition to the common spectral peaks of choline (Cho), creatine (Cr) and N-acetylaspartate (NAA).


Figure 6 Cerebral imaging of family 3

Cerebral imaging of family 3, as reported previously253. Left panel: Mid-sagittal T1-weighted MRI of the brain showing a markedly thin corpus callosum in affected individual IV-3 (white arrow). Right panel: Transverse T2-weighted MRI of the brain showing subtle white matter hyperintensities in the same individual.
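The positional rule used in this chapter to predict nonsense-mediated decay (a premature termination codon more than 55 nucleotides upstream of the last exon-exon boundary) can be written as a one-line check. The sketch below is illustrative: the function name is hypothetical and the coordinates in the usage lines are made-up mRNA positions, not the actual exon structure of DDHD2.

```python
def predicts_nmd(ptc_position, last_junction_position, min_distance=55):
    """Apply the positional NMD rule in mRNA coordinates.

    ptc_position: position of the premature termination codon.
    last_junction_position: position of the last exon-exon boundary.
    Returns True when the PTC lies more than min_distance nucleotides
    upstream of that boundary.
    """
    return last_junction_position - ptc_position > min_distance


# Hypothetical transcript with its last exon-exon boundary at position 2000.
print(predicts_nmd(900, 2000))    # PTC far upstream of the boundary
print(predicts_nmd(1960, 2000))   # PTC within 55 nucleotides of the boundary
print(predicts_nmd(2100, 2000))   # PTC in the last exon
```

A PTC in the last coding exon, as for c.2057delA, always yields a negative distance and is therefore predicted to escape NMD under this rule.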

quantitative RT-PCR analysis. In affected individuals of family 1, we observed 84% expression of DDHD2 mRNA as compared to eight control individuals (p=0.04; Figure 9). As NMD usually results in 15-58% expression as compared to controls257,258, this suggests NMD for only one of the two alleles, which is in line with the fact that one of the mutations, c.2057delA, is located in the last coding exon and hence does not target the mRNA for NMD. This mutation, however, is predicted to alter the protein sequence of the last 14 amino acids of the DDHD domain, of which seven amino acids are conserved down to Drosophila melanogaster (Figure 3B). For family 4, 46% expression of DDHD2 mRNA was observed (p<0.0001), which suggests that both alleles are subject to NMD. For individual IV-10 of family 3, we found an eight-fold reduction of DDHD2 mRNA expression as compared to two healthy control samples. As expected for mutations that result in NMD, DDHD2 mRNA levels in families 1 and 4 were restored to normal when cells of affected individuals were treated with the protein translation inhibitor cycloheximide (Figure 9). Further mRNA studies comparing DDHD2 expression levels among different human tissues show markedly high expression in the adult central nervous system, which is in line with the observed phenotypic features (Figure 10). DDHD2 (DDHD domain containing 2) belongs to the mammalian intracellular phospholipase A1 (iPLA1) family. This protein family consists of three paralogs, DDHD1, DDHD2 and SEC23IP, that share a conserved lipase motif (GxSxG) and a DDHD domain. They exhibit enzymatic activity and hydrolyze an acyl group of phospholipids at the sn-1 position. All three members have been implicated in


Figure 7 Cerebral imaging of family 4


Cerebral imaging of family 4. Upper left panel: Mid-sagittal T1-weighted MRI of the brain showing a markedly thin corpus callosum (white arrow). Upper right panel: Transverse T2-weighted MRI of the brain showing subtle white matter hyperintensities. Bottom panel: Proton MR spectrum obtained at a magnetic field of 3 Tesla. The voxel was placed at the thalamus area, from which a proton MR spectrum at short echo time (30 ms) was obtained, showing the prominent pathologic lipid peak at 1.3 ppm in addition to the common spectral peaks of choline (Cho), creatine (Cr) and N-acetylaspartate (NAA).

organelle biogenesis and membrane trafficking224,225,259,260. From N- to C-terminus, DDHD2 contains a WWE domain, a GxSxG lipase motif, a sterile alpha motif (SAM) and a DDHD domain (Figure 1E). The DDHD domain is a multifunctional domain that, together with the SAM domain, is essential for its


Figure 8 Spinal imaging of family 1 and 2

Sagittal T2-weighted MRI of the spine of affected individual II-2 of family 1 and individual II-2 of family 2, both showing a spinal syrinx (white arrow).

phospholipase activity224. DDHD2 preferentially hydrolyzes phosphatidic acid (PA), but also exhibits activity towards several other phospholipids, such as phosphatidylethanolamine. The protein localizes to the cis-Golgi and also to the ER-Golgi intermediate compartment224,225,261. Furthermore, RNA interference experiments in cellular systems have indicated a function for DDHD2 in Golgi to plasma membrane transport224. In view of this, it has been hypothesized that DDHD2 determines (local) membrane curvature and facilitates membrane and vesicle fusion by modifying membranes through phospholipid hydrolysis261. Given the function of DDHD2 in phospholipid metabolism and the lipid accumulation observed by brain proton MR spectroscopy, we investigated fibroblasts of affected individuals for abnormalities in lipid metabolism. Oil Red O staining of fibroblasts of individuals II-1 of family 1, II-2 of family 2 and II-1 of family 4 showed no difference in appearance and number of lipid droplets between affected individuals (n=3) and controls (n=3) (Figure 11). We also investigated the effect of the mutations on cellular organelle morphology, because DDHD2 has been described as an intracellular transport protein involved in organelle biogenesis224,225,259,260. Electron microscopy on fibroblasts from individuals II-1 of family 1, IV-10 of family 3 and II-1 of family 4 showed a normal appearance of the endoplasmic reticulum, Golgi, mitochondria and nucleus in fibroblasts of affected individuals (n=3) and no difference in organelle distribution as compared to fibroblasts of control individuals (n=4) (Figure 12-13). In all fibroblast cell lines, dense lysosomes and small empty vacuoles could be found in various amounts; however, in a minority of cells from


Figure 9 Effect of mutations in family 1 and 4 on DDHD2 expression

[Bar chart of DDHD2 expression for family 1, family 4 and controls; p=0.04 for family 1 and p<0.0001 for family 4.]

Effect of the compound heterozygous mutations c.1804_1805insT and c.2057delA from family 1, and of c.859C>T from family 4, on DDHD2 expression in Epstein-Barr-virus transformed lymphoblastoid cell lines (EBV-LCLs). Shown is the mean expression of DDHD2 in EBV-LCLs of affected individuals of family 1 (n=2), family 4 (n=1) and controls (n=8), treated without (-CH) and with (+CH) cycloheximide. Quantifications were performed in duplicate and normalized against GUSB and PPIB. Differences in expression between samples of the affected individuals and the eight controls were calculated by the comparative Ct or 2^ΔΔCt method216,217. The p-value was derived from the standard score (Z-value) calculated for each individual as compared to the normal distribution of the eight controls. Since we assume a lower expression level as a result of nonsense-mediated decay (NMD), a one-sided test was sufficient to test the null hypothesis, i.e., no statistically significant difference between the expression of DDHD2 in EBV-LCLs of an affected individual and that in EBV-LCLs of controls. We used an alpha level of 0.05, because only one gene was assessed. Inhibition of translation was obtained by treatment of EBV-LCLs with 20 µl cycloheximide (concentration: 100 mg/ml in DMSO), which was added to the medium and incubated for 4 hours at 37°C.

individual II-1 of family 1, large empty vacuoles could be seen, localized at the cellular membrane and surrounded by a single membrane and glycogen (Figure 12,1). These large vacuoles were not observed in fibroblasts of any of the other affected individuals, nor in those of the control individuals, and are thus of uncertain significance. These results do not support gross abnormalities in lipid metabolism or organelle morphology in cultured fibroblasts of affected individuals. 
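The expression analysis described in the Figure 9 legend (comparative Ct quantification followed by a one-sided Z-test against the controls) can be sketched as follows. This is a minimal illustration under assumptions not stated in the source: the reference Ct is taken as the mean Ct of the reference genes, the control spread is estimated with the sample standard deviation, and all Ct and expression values in the usage lines are invented.

```python
import math
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Comparative Ct method: fold expression of a sample relative to a calibrator.

    ct_reference is assumed to be the mean Ct of the reference genes
    (GUSB and PPIB in the legend); the *_cal values belong to the calibrator.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


def one_sided_lower_p(value, controls):
    """Z-score of one individual against the controls and the lower-tail p-value."""
    z = (value - mean(controls)) / stdev(controls)
    # Standard normal cumulative distribution function evaluated at z.
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return z, p


# Invented example: the sample needs one extra cycle relative to the calibrator.
print(relative_expression(25.0, 20.0, 24.0, 20.0))

# Invented control expression values (percent of the control mean).
z, p = one_sided_lower_p(80.0, [90.0, 100.0, 110.0])
print(round(z, 2), round(p, 4))
```

A one-sided lower-tail test matches the stated expectation that NMD can only reduce expression; with a single gene tested, no multiple-testing correction is applied to the alpha level of 0.05.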
To further demonstrate an essential function for DDHD2 in the central nervous system, we targeted the Drosophila homologue of DDHD2, CG8552, using the UAS-Gal4 system262 and inducible RNA interference (RNAi)263 with the pan-neuronal UAS-dicer2; elav-Gal4 driver. Of note, CG8552 (from here on referred to as Ddhd) is equally related to DDHD2 and its paralogue SEC23IP, one of the other human intracellular iPLA1s. Three different Ddhd RNAi lines (vdrcGD35956, vdrcGD35957

95 CHAPTER 4

Figure 10 Expression of DDHD2 in different human tissues

Tissue groups shown: fetal tissue, adult tissue, adult brain tissue.

Expression of DDHD2 by mRNA expression analysis in human fetal and adult tissues. Relative expression levels are given as the fold change in comparison to the tissue with the lowest expression level. Quantifications were performed in duplicate and normalized against GUSB and PPIB. Differences in expression between tissues were calculated by the comparative Ct or 2^(-ΔΔCt) method216,217.

and vdrcKK108121), representing two non-overlapping RNAi target sequences, were utilized and compared to two control fly lines that represent the same genetic background as the mutant lines (vdrcGD60000 for vdrcGD35956 and vdrcGD35957, and vdrcKK60100 for vdrcKK108121). The effect of neuron-specific Ddhd knockdown on synaptic and sub-synaptic organization was studied at the neuromuscular junction (NMJ). The Drosophila larval NMJ is a well-established synaptic model system that shares major features with central excitatory synapses in the mammalian brain264 and has successfully been used to characterize a number of Drosophila models of neurological diseases, including HSP and ID disorders265-268. Staining of these synapses with an antibody against the scaffolding protein disc large 1 (dlg1) highlights the overall NMJ morphology. Quantitative measurement of the number of synaptic branches and branching points revealed normal overall architecture of


Figure 11 Oil Red O staining of Cultured fibroblasts of Controls and affected individuals

Panel labels: Family 1, II-1; Control 1; Family 2, II-2; Control 2; Family 4, II-1; Oil Red O control staining. Scale bars: 100 µm.

Representative fibroblast cells of individuals II-1 of family 1 (A), II-2 of family 2 (C) and II-1 of family 4 (E) and two control individuals (B,D). (F) Positive control. Cells were stained with Oil Red O to visualize lipid droplets. No difference was observed in the number and appearance of lipid droplets between cells of affected individuals (n=3) and controls (n=3). Cells were seeded and cultured overnight on glass slides, fixed in 3.7% formalin for 10 minutes, rinsed in demineralized water, stained for 30 minutes in filtered Oil Red O solution dissolved in isopropanol and rinsed in demineralized water, followed by a brief counterstaining in hematein. After rinsing in tap water for 10 minutes, the slides were sealed using xylol-based mounting medium.

synaptic terminals in all three knockdown conditions. Synaptic terminals of one of the knockdown lines (vdrcKK108121) were smaller, as reflected in both a smaller area (91%, p=0.012) and a shorter length (88%, p=0.001) (Figure 14). We also determined the number of chemical synapses, so-called active zones, within synaptic terminals, which represent the pre-synaptic sites of neurotransmitter release. Active zones were visualized with an antibody against the active zone component bruchpilot


Figure 12 Electron Micrographs (EM) of human fibroblasts of affected individuals and Controls

Panel labels: Control 1; Control 2; Family 1, II-1; Family 3, IV-10; Family 4, II-1.

Representative electron micrographs of two control individuals (A-F) and individuals II-1 of family 1 (G-I), IV-10 of family 3 (J-L) and II-1 of family 4 (M-O). Left panels: 1.2K (A,D,G,M) and 2K (J) images. Middle and right panels: 6K images. For panels A-I and M-O: spun-down pellets of cultured fibroblasts were fixed in 2% glutaraldehyde in 0.1 M phosphate buffer for 4 hours, rinsed in phosphate buffer and postfixed for 1 hour in 1% osmium containing 1% potassium hexacyanoferrate. Semithin (1 µm) sections for previewing and ultrathin (70 nm) sections were cut on an ultramicrotome (Leica EM UC6), collected on 200-mesh copper grids and contrasted with uranyl acetate and lead citrate double stain (as previously described by Sjostrand et al.273 and Reynolds et al.274). For panels J-L: cultured fibroblasts were fixed with the formaldehyde-glutaraldehyde method as described by Karnovsky et al.275. Semithin (1.30 µm) and ultrathin (95 nm) sections were prepared and stained with 1% aqueous toluidine blue on glass slides and 200-mesh Cu grids, respectively, and contrasted with uranyl acetate and lead citrate double stain273,274. All sections were examined and images generated in a Jeol JEM1200 Transmission Electron Microscope (Jeol, Netherlands). Abbreviations: N = nucleus, Nuc = nucleolus, ER = endoplasmic reticulum, M = mitochondria. Red asterisks indicate vacuoles, arrows indicate glycogen.


Figure 13 Confocal images of cultured fibroblast cells

Panel labels: control fibroblast cells; family 3, IV-10 fibroblast cells.

Confocal images of representative cultured fibroblast cells from affected individual IV-10 of family 3 and a healthy control individual. Fibroblast cells were cultured on sterile cover slips, fixed with methanol, incubated with monoclonal or polyclonal anti-Calnexin, -Golgin-97 and -COPII antibodies and re-incubated with the appropriate rhodamine-labelled secondary antibodies. Finally, fixed cells were mounted in immunofluor medium (ICN Biomedicals, Irvine, California, USA). Data were acquired using a Nikon confocal microscope 1. Images, presented as single sections in the z-plane, were prepared using Adobe Photoshop (Adobe Inc.). Biomarkers Calnexin (A, B), Golgin-97 (C, D) and a COPII subunit (E, F) were used to stain, respectively, the endoplasmic reticulum, the Golgi complex and endoplasmic reticulum exit sites in fibroblasts from individual IV-10 of family 3 and from a control. All three biomarkers show normal organelle distribution in all cells.


Figure 14 Quantification of NMJ morphology in Drosophila larvae

Panel titles recovered from the figure: Longest branch length; Branching points. Fly lines shown: vdrcGD60000, vdrcGD35956, vdrcGD35957, vdrcKK60100, vdrcKK108121.

Synapse morphology and organization at the Drosophila neuromuscular junction of control and CG8552 (Ddhd) knockdown flies. Three different RNAi lines, vdrcGD35956, vdrcGD35957, and vdrcKK108121, from the Vienna Drosophila Research Center, were used and compared to their genetic background lines, vdrcGD60000 and vdrcKK60100, respectively. Quantification of NMJ area (A), branch length (B), longest branch length (C), number of branching points (D), number of branches (E) and number of islands (F).


Figure 15 Synapse morphology and organization at the Drosophila neuromuscular junction of control and CG8552 (Ddhd) knockdown flies

Panel labels: Control; Ddhd knockdown; dlg1; brp; macro. Scale bar: 20 µm. Fly lines shown: vdrcGD60000, vdrcGD35956, vdrcGD35957, vdrcKK60100, vdrcKK108121.

Three different RNAi lines, vdrcGD35956, vdrcGD35957, and vdrcKK108121, from the Vienna Drosophila Research Center, were used and compared to their genetic background lines, vdrcGD60000 and vdrcKK60100, respectively. RNA interference was induced with the pan-neuronal UAS-dicer2; elav-Gal4 driver. Drosophila muscle 4 type 1b neuromuscular junctions (NMJs) were analyzed as previously described268. (A) Anti-dlg1 (upper panel) and anti-brp immunolabelling (middle panel) at the NMJ of control (vdrcGD60000) and Ddhd knockdown larvae (vdrcGD35956) and output of computer-assisted analysis with a macro developed in house (bottom panel). Each white dot represents one active zone. (B) Quantification of active zones shows a significant reduction in all three RNAi lines compared to their genetic background controls. P = p-values (two-sided t-tests), n = number of quantified synaptic terminals. Error bars indicate standard error of the mean.

(brp). Knockdown of Ddhd resulted in a mild but statistically significant decrease in active zone number per synaptic terminal as compared to the appropriate genetic background controls. This phenotype was consistent in all three RNAi lines (vdrcGD35956, 90%, p=0.029; vdrcGD35957, 80%, p<0.0001; and vdrcKK108121, 88%, p=0.002; Figure 15). No obvious motor abnormalities were observed in two- and ten-day-old flies of the three knockdown lines. Since brp and active zones are crucial for synaptic transmission and plasticity269, these observations support an essential role for human DDHD2 in synaptic organization and transmission. We also compared the clinical characteristics of the emerging DDHD2 phenotype to complex HSP phenotypes that give rise to autosomal recessive spastic paraplegia with thin corpus callosum and white matter abnormalities, i.e. SPG11, SPG15, SPG21, SPG44, and SPG47. This shows a clear clinical overlap of the DDHD2 phenotype with SPG11 (SPG11), SPG15 (ZFYVE26) and SPG47 (AP4B1). However, individuals with SPG11 or SPG15 in general show a later age of onset of spasticity, namely in the second decade of life. Although individuals with SPG47 do show onset in the first decade of life, they present with more severe intellectual disability than the individuals in our families. The overlap with individuals with SPG21 and SPG44 is limited to the brain pattern of thin corpus callosum and white matter abnormalities247. The proteins encoded by SPG11, ZFYVE26 and AP4B1 (spatacsin, spastizin, and AP4B1, respectively) are all proposed to be involved in intracellular trafficking226-228. The overlap in function between these three proteins and DDHD2 might relate to the phenotypic overlap. Notably, abnormal lipid accumulation on magnetic resonance spectroscopy has not been reported in any of these complex autosomal recessive HSPs270-272, and this finding therefore represents a valuable parameter to distinguish the DDHD2 phenotype from the other complex HSPs. In conclusion, we identified recessive mutations in DDHD2, an iPLA1, and define a complex form of HSP, designated SPG54.
The core phenotype of the mutations in DDHD2 consists of very early onset of spasticity, ID, and a specific pattern of structural and metabolic brain abnormalities, the latter representing a useful distinguishing biomarker in clinical evaluation. An essential role for DDHD2 in lipid metabolism in the central nervous system is underlined by lipid accumulation that we observed in the brains of affected individuals, by markedly high expression of DDHD2 mRNA in human central nervous system, and by the reduced number of active zones at synaptic terminals of the Drosophila Ddhd knockdown nervous system.

Acknowledgements

We are grateful to the studied individuals and their families for their support and cooperation. We thank Prof. Dr. N. Knoers for her contribution to the collection of the follow-up cohort and A. Heister for homozygosity analysis in family 4. Part of this work was completed in collaboration with the FORGE Canada Consortium, supported by the Government of Canada through Genome Canada, the Canadian Institutes of Health Research (CIHR) and the Ontario Genomics Institute (OGI-049), and part by the Netherlands Organization for Health Research and Development (ZonMW; VIDI grants 917-86-319 to BBAdV and 917-96-346 to AS), the GENCODYS project (EU-7th-2010-241995 to HvB, AS, ATVvS and BBAdV) and the Dutch Brain Foundation (APMdB; 2010(1)-30 and BBAdV; 2009(1)-22). The laboratories of LA and BRA are funded by the Dubai Harvard Foundation for Medical Research and the United Arab Emirates University. This study was approved by the Medical Ethics Committee of the Radboud University Nijmegen Medical Centre, and all participants signed informed consent. The authors declare no conflict of interest.

Web Resources

The URLs for data presented herein are as follows:
Online Mendelian Inheritance in Man. URL: http://www.omim.org
Exome Variant Server, NHLBI Exome Sequencing Project (ESP), Seattle, WA. URL: http://evs.gs.washington.edu/EVS/ [May 2012]
FASTX-Toolkit. URL: http://hannonlab.cshl.edu/fastx_toolkit/


Recurrent de novo mutations in PACS1 cause defective cranial neural crest migration and define a recognizable intellectual disability syndrome

(the American Journal of Human Genetics, in press)

Janneke H.M. Schuurs-Hoeijmakers1,*, Edwin C. Oh2,*, Lisenka E.L.M. Vissers1,*, Mariëlle E.M. Swinkels3, Christian Gilissen1, Michèl A. Willemsen4,5, Maureen Holvoet6, Marloes Steehouwer1, Joris A. Veltman1, Bert B.A. de Vries1,5, Hans van Bokhoven1,5, Arjan P.M. de Brouwer1,5, Nicholas Katsanis2,7,#, Koenraad Devriendt6,#, Han G. Brunner1,#

* First authors contributed equally to this study, # Last authors contributed equally to this study

1 Department of Human Genetics - 855, Nijmegen Centre for Molecular Life Sciences and Institute for Genetic and Metabolic Disease, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands
2 Center for Human Disease Modeling, Duke University, Durham NC 27710, USA
3 Medical Genetics, University Medical Center Utrecht, 3508 AB Utrecht, The Netherlands
4 Department of Pediatric Neurology, Radboud University Nijmegen Medical Centre, PO Box 9101, 6500 HB Nijmegen, The Netherlands
5 Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, PO Box 9104, 6500 HE Nijmegen, The Netherlands
6 Center for Human Genetics, Clinical Genetics, University Hospitals Leuven, Herestraat 49 BUS 602, 3000 Leuven, Belgium
7 Departments of Cell Biology and Pediatrics, Duke University, Durham NC 27710, USA

CHAPTER 5

Abstract

We studied two unrelated boys with intellectual disability and a striking facial resemblance suggestive of a hitherto unappreciated syndrome. Exome sequencing in both families identified identical de novo mutations in PACS1, suggestive of causality. To support these genetic findings and to understand the pathomechanism of the mutation, we studied the protein in vitro and in vivo. Mutant PACS1 forms cytoplasmic aggregates in vitro with concomitant increased protein stability and shows impaired binding to an isoform-specific variant of TRPV4, but not to the full-length protein. Further, consistent with the human pathology, expression of mutant PACS1 mRNA in zebrafish embryos induces craniofacial defects, likely in a dominant-negative fashion. This phenotype is driven by aberrant specification and migration of SOX10-positive cranial, but not enteric, neural crest cells. Our findings suggest that PACS1 is necessary for the formation of craniofacial structures, and that perturbation of its functions results in a specific syndromic ID phenotype.

106 RECURRENT MUTATIONS IN PACS1 CAUSE A RECOGNIZABLE ID SYNDROME

Intellectual disability (ID) affects 1-3% of the population and has a strong genetic component. Despite technical progress, establishing a genetic diagnosis remains challenging, in part because of substantial genetic heterogeneity and clinical variability15. Recent data have indicated a high rate of de novo events in ID115,123,134, suggesting that family-based exome sequencing can be an efficient tool to identify genetic causes of ID and thus probe its molecular etiology. We recruited two unrelated boys with unexplained ID and a remarkable facial resemblance (Figure 1A). The first boy is the only child of unrelated Dutch parents. A paternal nephew has developmental delay of a different severity and a different physical appearance compared to individual 1, and was therefore considered to have a distinct clinical condition. Individual 1 was born at term, by vacuum extraction, after an uncomplicated pregnancy. The mother was treated with mesalazine for Crohn's disease throughout pregnancy. Birth weight was 3250 gram (25th centile) and Apgar score 5/6/8. A single umbilical artery was noted. On the second day of life, he developed seizures, which were successfully treated with anti-epileptic medication. Four weeks after birth, the boy developed volvulus due to intestinal malrotation. A large part of the small intestine was resected during emergency laparotomy. He developed short bowel syndrome, for which he received total parenteral feeding until age 4 years, thereafter replaced by tube feeding. A vesico-ureteral reflux grade II resolved spontaneously. Left-sided cryptorchidism was surgically corrected by orchidopexy. During this operation, a streak testis was observed on the right side. Development was delayed: he was able to sit with support at age 10 months, walked at age 3 4/12 years and spoke his first words at age 3 6/12 years. Language production was more delayed than verbal understanding, and dyspraxia was noted. His IQ was measured as <50. On physical examination we saw a friendly boy with some stereotypic movements. At age 3 6/12 years his weight was 15 kg (16th centile), length 102 cm (50th centile) and OFC 49 cm (16th centile).
His facial features were characterized by a low anterior hairline, hypertelorism with downslanting palpebral fissures, mild synophrys with highly arched eyebrows, long eyelashes, a bulbous nose, a flat philtrum and large, low-set ears (6 cm, 97th centile). He has a wide mouth with downturned corners, a thin upper lip and diastema of the teeth (Figure 1A). Widely spaced nipples, slender fingers but broad and short thumbs, clubbing nails, a single transverse palmar crease on the left hand and pes planus were also noted. Neurological examination showed simple motor patterns without specific pyramidal, extrapyramidal, cerebellar or neuromuscular abnormalities. Cerebral MRI showed a cavum septum pellucidum, but was otherwise normal. Conventional karyotyping as well as SNP array testing (Affymetrix, 250K) showed a normal male karyotype. Because of some facial similarities to Cornelia de Lange Syndrome ([MIM 122470]), NIPBL ([MIM 608667]), SMC1A ([MIM 300040]) and SMC3 ([MIM 606062]) were sequenced for mutations, but none were found.


Figure 1 Photographs and genetic data of two unrelated individuals with an identical de novo mutation in PACS1 (NM_018026.2)

Panel labels: Individual 1, Mother, Father; Individual 2, Mother, Father; sequence TAC AAG AAT CGG ACC ATC with amino acid track Tyr Lys Asn Trp Thr Ile; R196RKRY CK2-binding motif; autoregulatory domain; C-terminal region; p.(Arg203Trp).

(A) Upper photograph: Individual 1 at 4 years of age; low anterior hairline, highly arched eyebrows, synophrys, hypertelorism with downslant of the palpebral fissures, long eyelashes, bulbous nasal tip, flat philtrum with thin upper lip, downturned corners of the mouth, diastema of teeth and low-set ears. Bottom photograph: Individual 2 at 12 years of age. Note the remarkable facial similarity. (B) Sequence reads from exome sequencing and chromatograms of Sanger confirmation, showing the identical de novo occurrence of the c.607C>T mutation in PACS1 in families 1 and 2. (C) Protein structure of PACS1, indicating the position of the p.Arg203Trp substitution in the furin (cargo)-binding region (FBR) of the protein, directly adjacent to the CK2-binding motif.


The second boy is the second child of healthy, unrelated parents of Belgian origin. There was one previous miscarriage. Family history is negative with regard to developmental delay or congenital malformations. The boy was born at term by caesarean section because of breech presentation. Birth weight was 4250 gram (90th centile), length 54 cm (97th centile) and OFC 36 cm (90th centile). His development was delayed. He walked at the age of 2 6/12 years. His IQ was measured as 53. Currently, at age 19 years, he functions well in a special school and, except for mild scoliosis, has no medical problems. When first seen at the age of 6 3/12 years, he appeared friendly and outgoing. Weight was 22 kg (50th centile), length 119 cm (50th centile) and OFC 51.7 cm (50th centile). Measurements at age 19 years: weight 64.5 kg (16th centile), length 181 cm (50th centile) and OFC 55.1 cm (16th centile). His facial features were characterized by hypertelorism with downslanting palpebral fissures, strabismus, long eyelashes and mild ectropion, highly arched eyebrows, downturned corners of the mouth, a narrow upper lip, especially in its middle part, and a flat philtrum (Figure 1A). Teeth were widely spaced and the ears were low-set. He had a short neck, widely spaced nipples and a mild pectus excavatum. He had clinodactyly and shortness of the fifth fingers and mild cutaneous webbing of the fingers. The skin showed multiple pigmented nevi. He had a small umbilical hernia, hypoplastic scrotum and cryptorchidism on the right side. Neurological examination showed balance problems and mild dysarthric speech. He was hypotonic. Conventional karyotype and array-CGH (Agilent 180k) showed a normal male karyotype. A CT scan of the brain revealed a partial agenesis of the vermis and hypoplasia of the cerebellar hemispheres, which was more pronounced on the right side.
Because of their strikingly similar facial dysmorphisms, which were unique among our cohort of >5,000 individuals with ID, we considered this a distinct dominant syndrome, plausibly with a common genetic defect. We therefore performed exome sequencing on DNA from both index-parent trios, using the ABI SOLiD™ 4 platform (described previously by Vissers et al.115, Life Technologies, Carlsbad, CA, USA), aiming at the identification of a causal de novo mutation (Table 1, Figure 2). Seven potential de novo non-synonymous variants were identified in individual 1 and six in individual 2 (Table 2). Sanger validation confirmed two as de novo in individual 1 and one in individual 2. Remarkably, the same de novo c.607C>T mutation in PACS1 (NM_018026.2, [MIM 607492]) was identified in both individuals (Figure 1B). This mutation is predicted to result in an arginine to tryptophan substitution, p.Arg203Trp (Figure 1C), at an evolutionarily invariant position in both PACS1 and its close paralog PACS2 (Figure 3). The c.607C>T mutation was absent in 150 alleles of control individuals of the Dutch population, in 2,304 alleles present in our local variant database, which is derived from exome sequencing experiments, and in 7,020 alleles of European American origin from the NHLBI Exome Sequencing Project.


Table 1 Raw sequencing statistics

                                               Individual 1  Father  Mother  Individual 2  Father  Mother  Average

Total number of sequenced reads (x10^6)        135.7         119.3   110.8   118.7         138.2   128.2   125.2
Total number of mapped reads (x10^6)           102.1         87.6    83.3    88.8          99.1    98.2    93.2
Total number of bases mapped (Gb)              4.85          4.15    3.95    4.06          4.52    4.52    4.34
Total bases uniquely mapped (Gb)               4.03          3.43    3.29    3.31          3.64    3.68    3.56
Total bases mapping to targets (Gb)            3.78          3.25    3.00    3.36          3.78    3.85    3.51
% targets with 10x coverage                    81.8          80.0    78.4    81.6          84.1    83.5    81.6
Mean target coverage                           72            62      57      66            75      76      68
Median target coverage                         53            46      42      48            54      56      50

PACS1 is a trans-Golgi membrane traffic regulator276,277 that directs protein cargo and several viral envelope proteins277-279. PACS1 mRNA expression is upregulated during human embryonic brain development, with low expression after birth (BrainSpan: Atlas of the Developing Human Brain). PACS1 contains a furin cargo binding region (FBR), bearing a CK2-binding motif, an autoregulatory domain, and N- and C-terminal ends of unknown function. Our p.Arg203Trp substitution is positioned in the FBR, directly adjacent to the R196RKRY CK2-binding motif that regulates the phosphorylation status of the autoregulatory domain and PACS1 activation280,281 (Figure 1C). The chance of observing an identical de novo base pair change in two individuals is extremely small. The recurrence of exactly the same de novo base pair change in two independent individuals with an identical clinical presentation, together with its absence from ~9,000 control alleles, therefore strongly supports causality. However, to provide further evidence and to probe the mechanistic basis of the dysmorphic phenotype, we studied the behavior of the p.Arg203Trp substitution in craniofacial cartilaginous structures in zebrafish embryos. We injected either wildtype (c.607C resulting in p.Arg203) or mutant (c.607T resulting in p.Trp203) human PACS1 mRNA into 2-4 cell stage zebrafish embryos. On scoring


Figure 2 Coverage plots

X-fold coverage

Coverage for all exons targeted by enrichment was evaluated. The median coverage over six individuals was 50-fold, with on average 81.6% of all targets covered at least 10-fold. Individuals 1 and 2 are represented by solid black and gray lines, respectively, with the parental samples in dotted lines and respective colors.
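The two summary figures quoted in this legend (median coverage and the percentage of targets covered at least 10-fold) can be derived from per-target depths as in the following minimal Python sketch. The function name and the input depths are hypothetical, and the sketch assumes one mean depth value per enrichment target.

```python
from statistics import median


def coverage_summary(target_depths, threshold=10):
    """Median depth and percentage of targets at or above `threshold`.

    `target_depths` is assumed to hold one mean depth per enrichment target.
    """
    covered = sum(1 for depth in target_depths if depth >= threshold)
    return {
        "median": median(target_depths),
        "pct_at_threshold": 100.0 * covered / len(target_depths),
    }


if __name__ == "__main__":
    # Hypothetical per-target depths, for illustration only.
    print(coverage_summary([0, 5, 10, 20, 50]))
```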

Table 2 Variant prioritization

                                                                 Individual 1   Individual 2

All variants                                                     26,980         23,120
QC filtering (a)                                                 23,096         19,748
After exclusion of nongenic, intronic & synonymous variants      5,425          4,993
After exclusion of known variants                                116            133
After exclusion of inherited variants                            7              6
Genes in overlap in both individuals                             1 (PACS1)

(a) ≥5 unique variant reads and ≥20% of all reads
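The prioritization steps of Table 2 amount to a filter cascade followed by an overlap between the two trios. The Python sketch below illustrates this under stated assumptions: the `Variant` record, its field names and the example gene names other than PACS1 are hypothetical, and only the QC thresholds (at least 5 unique variant reads and at least 20% of all reads) are taken from the table footnote. It is not the pipeline used in this study.

```python
from dataclasses import dataclass

# Effect classes removed in the third row of Table 2.
EXCLUDED_EFFECTS = {"nongenic", "intronic", "synonymous"}


@dataclass(frozen=True)
class Variant:
    gene: str
    effect: str          # e.g. "missense", "synonymous", "intronic"
    variant_reads: int   # unique reads supporting the variant
    total_reads: int     # all reads covering the position
    known: bool          # present in a database of known variants
    in_father: bool
    in_mother: bool


def passes_qc(v, min_reads=5, min_fraction=0.20):
    """QC thresholds from the footnote of Table 2."""
    return v.variant_reads >= min_reads and v.variant_reads >= min_fraction * v.total_reads


def prioritize(variants):
    """Apply the filters in table order; return survivors and counts per step."""
    steps = [
        passes_qc,
        lambda v: v.effect not in EXCLUDED_EFFECTS,
        lambda v: not v.known,
        lambda v: not (v.in_father or v.in_mother),  # keep candidate de novo only
    ]
    remaining = list(variants)
    counts = [len(remaining)]
    for keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append(len(remaining))
    return remaining, counts


def shared_genes(candidates_a, candidates_b):
    """Genes carrying a candidate de novo variant in both individuals."""
    return {v.gene for v in candidates_a} & {v.gene for v in candidates_b}
```

Candidates that survive such a cascade still require independent validation, as was done here by Sanger sequencing.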


Figure 3 Cross-species alignment of PACS1

Homo sapiens PACS-1   176  SLQYPHFLKRDANKLQIMLQ RRKRY JTILGYKTLAVGLINMAEVMQHPNEGAL  230
Homo sapiens PACS-2    30  SLQYPHFLKREGNKLQIMLQ RRKRYK ITILGYKTLAAGSISMAEVMQHPSEGGQ  85
Mus musculus          174  SLQYPHFLKRDANKLQIMLQ RRKRY ITILGYKTLAVGLINMAEVMQHPNEGAL  229
Gallus gallus         206  SLQYPHFLKRDGNKLQIMLQ RRKRYK ITILGYKTLAVGIINMAEVMQHPTDGGQ  261
Danio rerio           111  SLQYPHFLKRDANKLQIMLQ RRKRY ITILGYKTLALGMINMAEVMQHPTEGAQ  166
D. melanogaster       190  SLQYPHFIKRDGNRLVILLQ RRKKY ITILGYKTLAEGIIRMDAVLQKSMD--M  242
C. elegans             89  CIQYPHFLKRKSNVLQILIQ RRKKFK JLPGGLRDLAVGNINLTYIMQ--QGGL  140

Cross-species alignment for human PACS1 (Homo sapiens; NP_060496) and PACS2 (Homo sapiens; NP_001230056), mouse (Mus musculus; NP_694769), chicken (Gallus gallus; XP_419325), zebrafish (Danio rerio; NP_001092218), fruitfly (Drosophila melanogaster; NP_524473) and roundworm (Caenorhabditis elegans; NP_505387.1). Depicted is the amino acid sequence containing the CK2-binding motif (white box), the position of the p.Arg203Trp substitution (black box) and the surrounding sequence.

Alcian Blue-stained four-day-old embryos injected with either 50 pg wildtype or 50 pg mutant PACS1 mRNA (n=61 embryos per batch, scored blind to injection cocktail), we observed that embryos expressing mutant PACS1 showed a significant reduction in cranial cartilaginous structures at the ventral aspect (p<0.001) relative to embryos expressing wildtype PACS1, the latter being indistinguishable from uninjected control zebrafish (Figure 4A, Figure 5B). The induction of a craniofacial phenotype upon overexpression of mutant human PACS1 mRNA argues against a loss-of-function effect of the mutation, but cannot differentiate between a gain-of-function and a dominant-negative mechanism. To examine these possibilities, we injected equimolar amounts of wildtype and mutant mRNA (50 pg each) together. We observed a significant rescue of the mutant craniofacial phenotype (n=55; p<0.001), suggestive of a dominant-negative mechanism (Figure 4A). The loss of craniofacial structures upon expression of mutant PACS1 mRNA may be the result of defective migration of cranial neural crest cells (CNCC), progenitors that give rise to the majority of skeletal and connective tissues in the face282-284. As an initial test of this hypothesis, we injected wildtype or mutant RNA in 2-4 cell stage sox10::eGFP transgenic zebrafish embryos, which express GFP in CNCCs. Analysis of four-day-old embryos showed a significant reduction in the migration of GFP-labeled cells (n=55; p<0.001) in the anterior-most region of the embryo (head), confirming that the absence of Alcian Blue staining is at least in part due to the loss of CNCC-derived progenitors (Figure 4B). To examine the specificity of the CNCC migration phenotype, we isolated RNA from the trunk and head regions of injected embryos and analyzed sox10::eGFP-positive cells harvested from enteric neural crest cells (ENCC) and CNCC. We observed a significant reduction in GFP mRNA levels in the head.
The phenotype was specific to this region; we observed no differences in GFP mRNA levels in the trunk of injected embryos (Figure 6). Together, these data suggest that PACS1 can promote


Figure 4 In vivo functional characterization of the p.Arg203Trp substitution in PACS1

Panel labels: Lateral; Dorsal; Arg203; Trp203. Bar chart categories: Arg203, Trp203, Arg203/Trp203; scored as normal or craniofacial abnormalities.

(A) Alcian blue staining of 4-day-old zebrafish larvae expressing either 50 pg wildtype (WT; c.607C resulting in p.Arg203) or 50 pg mutant (c.607T resulting in p.Trp203) PACS1 RNA. Left panel: craniofacial cartilaginous structures visualized in both lateral and ventral views of the embryo. Right panel: quantification of craniofacial phenotypes in embryos expressing WT PACS1, mutant PACS1, and both WT and mutant PACS1 combined. White arrows and asterisks highlight Meckel's cartilage in the lateral and ventral perspectives of the embryos. Human PACS1 WT and mutant mRNA were in vitro transcribed using an mMESSAGE mMACHINE SP6 Kit (Ambion), and 0.5 nL was microinjected into 2-4 cell stage zebrafish embryos.

the specification and migration of CNCCs, although fate mapping studies will be required to substantiate these observations. To assess whether the phenotypes observed in the two affected individuals and zebrafish embryos might be the result of misfolding and/or mistrafficking of the protein, we studied green fluorescent protein (GFP)-tagged wildtype (p.Arg203) and mutant (p.Trp203) PACS1 in ARPE-19 cells. We examined the localization of the


Figure 4 Continued

Panel labels: Trunk; Head; Arg203; Trp203; SOX10::GFP. Bar chart categories: Arg203, Trp203, Arg203/Trp203; scored as normal or craniofacial abnormalities.

(B) Imaging of 4-day-old sox10::eGFP zebrafish larvae expressing either 50 pg WT or 50 pg mutant PACS1 RNA. Left panel: migration of eGFP-labeled CNCCs. Right panel: CNCC migration phenotype scored in embryos expressing WT PACS1, mutant PACS1, and both WT and mutant PACS1 combined.

constructs in cells grown to confluence. Analysis of ~150 cells showed that 32% of cells with the mutant construct contained cytoplasmic GFP aggregates, a phenotype seen in <4% of cells with the wildtype construct (Figure 7A). Since aggregates of the mutant GFP-p.Trp203 PACS1 may be the result of misfolding, we next asked whether the stability of mutant PACS1 is also perturbed. We transfected wildtype and mutant PACS1 constructs into HEK293FT cells (as transfection efficiency is >90%). While GFP expression of wildtype PACS1 diminishes over time, expression of mutant PACS1 remains constant, indicating that the mutant protein is more stable than the wildtype protein (Figure 5A and 7B). Given the cellular aggregate phenotype coupled to defects in protein stability, we next asked


Figure 5 Immunoblot of protein stability and flatmounting of zebrafish embryos

Panel labels: (A) PACS1 Arg203-GFP; PACS1 Trp203-GFP; IB: GFP; IB: GAPDH; CHX (hrs). (B) Trp203; Arg203.

(A) Representative immunoblot showing protein stability levels for wildtype (p.Arg203) and mutant (p.Trp203) PACS1. Lysates were obtained after 0, 2, 4, and 6 hrs of cycloheximide treatment. Experiments were performed in triplicate. (B) Light micrograph images showing flatmounted, alcian blue stained 4-dpf embryos injected with wildtype or mutant PACS1.

Figure 6 PACS1 mRNA levels in PACS1 injected zebrafish

Quantification of Sox10::GFP mRNA levels in PACS1 injected zebrafish. RNA isolated from trunk and head regions of wildtype (WT; c.607C) versus mutant (c.607C>T resulting in p.Arg203Trp) PACS1 injected zebrafish was used for real-time qPCR analysis. Student t-test was performed; *** p < 0.001. Error bars represent standard deviation.

whether the mutant variant alters the formation of PACS1-dependent complexes. We observed no significant effect of mutant PACS1 on two known interactors, AP3D1 and CLCN7 (data not shown). However, we observed a significant phenotype for a third interactor, TRPV4. Specifically, we co-transfected GFP-tagged PACS1 with V5-tagged full-length TRPV4 (TRPV4v1; NM_021625) and a smaller isoform of TRPV4 (TRPV4v2; NM_147204) that is missing residues 311-371, predicted to encode an ankyrin repeat. While both wildtype and mutant PACS1 bound to full-length TRPV4 with similar affinities, we detected significantly less TRPV4v2 in the immunoprecipitate with mutant PACS1 (Figure 7C). TRPV4 has been implicated in the migration of tumor endothelial cells285, in visceral mechanosensation286 as well as, more broadly, in the F-actin-mediated regulation of the shape of cellular surfaces287. It is unclear how TRPV4v2 participates in disease etiology, but we note that impairment of its known role in visceral mechanosensation in the gastrointestinal tract may have contributed to the volvulus that individual 1 experienced in the neonatal period286. Further, the binding of full-length TRPV4 to wildtype and mutant PACS1 is consistent with the lack of anosmia in individuals 1 and 2 (refs 288-290). Our data suggest that the introduction of the p.Arg203Trp substitution triggers cytoplasmic aggregates, leads to protein trafficking defects and likely abrogates the ability of PACS1 to perform its normal function. Taken together, our data show that de novo mutations of PACS1 cause a hitherto unknown syndrome of intellectual disability in combination with distinct craniofacial features and genital abnormalities. The most parsimonious model is that of a dominant-negative mechanism that abrogates the ability of PACS1 to mediate the specification and migration of Sox10-positive cells in the neural crest. This in turn would perturb the migration of cells along the branchial arch, contributing to the striking craniofacial phenotype of our affected individuals. Our findings potentially implicate a splice isoform of TRPV4 in this process; however, the function of this isoform is not known, nor can we exclude that the mutation in the affected individuals also affects other PACS1 roles. Our findings highlight how the combination of detailed clinical phenotyping, unbiased genomic analysis, and functional dissection


Figure 7 In vitro functional characterization of the p.Arg203Trp substitution in PACS1

[Panel A image: localization of Arg203 and Trp203 PACS1; panel B plot: Arg203 and Trp203 protein levels against time after CHX treatment; panel C image: IP: GFP with IB: V5 and IB: GFP for PACS1 Arg203-GFP or PACS1 Trp203-GFP co-transfected with TRPV4v1-V5 or TRPV4v2-V5]

(A) Localization of GFP-wildtype (GFP-WT) and GFP-mutant (GFP-Trp203) PACS1 in transfected ARPE-19 cells grown to confluence and stained with a GFP antibody. ARPE-19 cells were grown in Dulbecco's Modified Eagle Medium and Ham's F-12 Nutrient 1:1 mixture (DMEM/F-12, Invitrogen) with 10% FBS and 2 mM L-glutamine. Transfection of PACS1 WT and mutant plasmids was carried out using FuGene6 Transfection Reagent (Roche). Cells were fixed with 4% PFA 72 hrs after transfection and probed with an anti-GFP antibody (Santa Cruz, sc-8334), followed by a secondary antibody Alexa Fluor 488 IgG (Invitrogen). (B) Quantification of WT and p.Trp203 PACS1 protein stability in transfected cells treated with CHX. Represented is the mean measurement of triplicate experiments, with the error bar representing the standard error of the mean. HEK 293FT cells were grown in Dulbecco's Modified Eagle Medium (DMEM, Invitrogen) containing 10% Fetal Bovine Serum (Invitrogen) and 2 mM L-glutamine (Invitrogen). Cells were treated with 50 mM Cycloheximide (Sigma) for 6 hours and harvested in co-IP buffer. (C) Immunoprecipitation of GFP-tagged WT and mutant PACS1 and V5-tagged TRPV4v1 (NM_021625) and TRPV4v2 (NM_147204). HEK293 cells were transfected with tagged constructs and harvested in co-IP buffer after 48 hrs. Immunoprecipitation was performed with an anti-GFP antibody and immunoblotted with a V5 antibody.

of variants informs diagnosis and provides insight into fundamental biological processes such as migration of cranial neural crest cells.

Acknowledgements

We are grateful to individuals 1 and 2 and their families for their support and cooperation. We thank J. de Ligt for bioinformatics support and data analysis. This work was funded in part by grants from the National Institutes of Health DK072301 and MH-084018 (to NK), the Dutch Organization for Health Research and Development (ZON-MW grants 916-86-016 to LELMV, 917-86-319 to BBAdV, 911-08-025 to JAV), the EU-funded TECHGENE project (Health-F5-2009-223143 to JAV), GENCODYS project (EU-7th-2010-241995 to HvB and BBAdV) and the Dutch Brain Foundation (APMB; 2010(1)-30 and BBAdV; 2009(1)-22). NK is a Distinguished Brumley Professor. This study was approved by the Medical Ethics Committee of the Radboud University Nijmegen Medical Centre, and all participants signed informed consent.

Web Resources

The URLs for data presented herein are as follows:
Exome Variant Server, NHLBI Exome Sequencing Project (ESP), Seattle, WA. URL: http://evs.gs.washington.edu/EVS/ [May 2012]
BrainSpan: Atlas of the Developing Human Brain [Internet]. Funded by ARRA Awards 1RC2MH089921-01, 1RC2MH090047-01, and 1RC2MH089929-01. © 2011. URL: http://developinghumanbrain.org
Online Mendelian Inheritance in Man. URL: http://www.omim.org


CHAPTER 6

6.1 How many genes
6.2 Genes for recessive forms of ID
6.3 Genes for dominant forms of ID
6.4 Establishing pathogenicity of mutations
6.5 The missing mutations in ID: 'you will find only what you are looking for'
6.6 Massive parallel sequencing makes its introduction into clinical genetics diagnostics
6.7 Treatment


Discussion and future perspective

6.1 How many genes

The rapid technological developments and decreasing experimental costs in DNA sequencing since 2005 have accelerated gene identification in Mendelian disorders regardless of inheritance mode. This is demonstrated by the increasing number of intellectual disability (ID) genes identified over the past five years: 18 in 2008, 14 in 2009, 17 in 2010, 35 in 2011 and already 45 by July 2012. In 2011, 16 of 35 genes were identified by exome sequencing and another 4 by other massive parallel sequencing (MPS) techniques (Chapter 1, Table 3 and Figure 6). By July 2012, 22 of 45 novel ID genes had been identified by exome sequencing and 15 by other MPS techniques. Exome sequencing has demonstrated its power especially for autosomal dominant syndromic forms of ID, to which de novo single base pair mutations appear to contribute significantly. As the costs of MPS decrease and the capacity for bioinformatics analysis and data storage increases, efficiency, sample throughput, accuracy and sensitivity of sequencing will improve further, resulting in a higher mutation detection yield. Because of the extreme genetic heterogeneity of ID, potentially pathogenic mutations identified by MPS techniques are often observed in single families only110, 115, 126, 131-134, 157, hampering clinical interpretation. In these cases it is essential to confirm pathogenicity either by (i) finding defects in the same genes in additional families with a similar clinical presentation, which will require large follow-up cohorts in the case of non-syndromic ID and detailed phenotyping to collect follow-up cohorts for syndromic ID, and/or (ii) studying the pathogenic nature of individual mutations in vivo and in vitro to obtain support for involvement of the respective gene in ID.
More than 450 genes have been implicated in genetic conditions with ID as a clinical feature to date 291-13 and many more await discovery. Around one fifth of the known genes (~110 genes) map to the X-chromosome, reflecting the history of gene discovery in ID (Chapter 1, Figure 6). This overrepresentation of genes on one specific chromosome out of the 23 human chromosomes, the X-chromosome, is due to the ease of identifying families with an X-linked inheritance pattern, combined with the possibility of investigating these families by linkage studies to define the region harboring the mutation. MPS techniques that assess the genome at base pair level now allow for an increase in the detection of mutations in autosomal genes that give rise to ID (Chapter 1, Figure 6). It is difficult to estimate the total number of genes involved in ID phenotypes and thus how many remain to be discovered. On the X-chromosome, ~110 genes are estimated to contribute to


ID160. This could be taken as a starting point for estimating the contribution of autosomal genes to ID phenotypes, although it is uncertain whether the special characteristics of sex-linked inheritance can so easily be extrapolated to autosomal inheritance patterns. Considering that 13% of X-chromosome protein-coding genes are implicated in X-linked ID, it would follow that there might be up to 2,500 autosomal genes involved in monogenic ID (13% of all 18,942 autosomal protein-coding genes, Vega Genome Browser release 48; June 2012; http://vega.sanger.ac.uk/Homo_sapiens/Info/Index).
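The extrapolation above is plain proportional arithmetic, and a minimal Python sketch makes its inputs explicit. The 13% fraction and the count of 18,942 autosomal protein-coding genes are taken from the text; the function name is purely illustrative, and the result inherits the caveat that X-linked proportions may not transfer to autosomes.

```python
def extrapolate_autosomal_id_genes(x_fraction: float, autosomal_genes: int) -> int:
    """Scale the fraction of X-chromosomal genes implicated in ID to the autosomes."""
    return round(x_fraction * autosomal_genes)


# Inputs quoted in the text: 13% and 18,942 autosomal protein-coding genes
estimate = extrapolate_autosomal_id_genes(0.13, 18_942)
print(estimate)
```

The computed value lies just below the rounded figure of "up to 2,500" quoted in the text.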

6.2 Genes for recessive forms of ID

Autosomal ID gene identification, especially for non-syndromic ID, has been lagging behind the discovery of X-linked ID genes. This is mainly for two reasons: (i) autosomal dominant ID gene identification in sporadic syndromic and non-syndromic cases relied mostly on copy number variation detection and (ii) autosomal recessive ID gene identification was limited to studies in large consanguineous families or clinically distinct recessive conditions that allowed for comparison of interfamilial mapping and sequencing data. We expanded the pool of families in which autosomal mutations of the recessive type can be identified by investigating small and non-consanguineous Western sibling families. This is a group of patients that was previously difficult to assess from a research point of view, as the small family size hampered gene identification. In our clinic, sibling families represent six percent of families with ID that seek genetic counseling. Families with only affected brothers are overrepresented as compared to families with only affected sisters or affected brothers and sisters, and comprise around 45% of these sibling families. This is to be expected since in these families both autosomal recessive and X-linked inheritance can contribute to the ID phenotype. The distribution of mild, moderate and severely affected individuals in the sibling families does not differ significantly from the distribution in isolated individuals with ID in our clinic (Chapter 1, Table 4). We applied homozygosity mapping under the assumption that in part of these families, homozygous mutations from a common ancestor would be the underlying molecular mechanism causing ID (Chapter 2)292. Initial results were promising, as they showed overlap of regions of homozygosity in our sibling families with published loci for non-syndromic autosomal recessive ID, and relatively long homozygous stretches (>5 Mb in size) in two families.
For the latter two families (ARMR1 and ARMR8), we performed targeted MPS of the coding sequence of homozygous regions of 11 Mb on chromosome 19 and 8.4 Mb on chromosome 6, respectively (unpublished data). This revealed one homozygous rare missense change, p.Cys616Ser, in transcription factor ZNF780A (NM_001142577.1)


that disrupts the last C2H2-zinc finger domain of the ZNF780A protein in family ARMR1. This change is considered potentially pathogenic. In family ARMR8 one homozygous rare missense change in MAP7 (NM_001198608.1), p.Ala637Thr, was identified. The changed amino acid is not conserved and in silico prediction of the effect of the missense change on protein function indicates that this is a benign variant. The homozygosity mapping approach combined with targeted MPS was outpaced almost instantly by the availability of exome sequencing, which allows for immediate identification of rare recessive variants of both the compound heterozygous and the homozygous type. Another advantage of exome sequencing over homozygosity mapping is its more or less unbiased approach to investigating the coding sequence on a genome-wide scale, which even makes it possible to study brother pairs under two separate assumptions: an autosomal recessive and an X-linked inheritance model. We chose to further study sibling families by exome sequencing. Twenty families were selected, among them four of the families described in Chapter 2 (ARMR2/W11-3400, ARMR6/W08-0748, ARMR8/W06-0984 and ARMR10/W07-1585). Not all families from Chapter 2 were included, either because insufficient material was available to perform exome sequencing, or because we preferred to study families in which unaffected siblings were also available for segregation testing. This genome-wide study in small non-consanguineous sibling families with ID (Chapter 3) shows that the number of candidate mutations per individual exome is manageable: in one exome, candidate mutations in nine genes (on average) were detected (Chapter 3). This is about half as many candidates as in the analysis of an average exome of one individual from a consanguineous family (Z. Iqbal, personal communication).
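The idea of studying affected siblings under an autosomal recessive and an X-linked model at the same time can be sketched in a few lines of Python. This is a simplified illustration, not the pipeline used in Chapter 3: the gene names, the record layout and the function `candidate_genes` are hypothetical, genotypes are coded as the number of rare alleles (hemizygous males as 1), and the phase of heterozygous variants is not checked, which in practice requires parental or unaffected sibling data.

```python
from collections import defaultdict

# Hypothetical rare variants after frequency filtering; real input would be annotated VCF data
variants = [
    {"gene": "GENE_A", "chrom": "2", "genotypes": {"sib1": 2, "sib2": 2}},
    {"gene": "GENE_B", "chrom": "7", "genotypes": {"sib1": 1, "sib2": 1}},
    {"gene": "GENE_B", "chrom": "7", "genotypes": {"sib1": 1, "sib2": 1}},
    {"gene": "GENE_C", "chrom": "X", "genotypes": {"sib1": 1, "sib2": 1}},
    {"gene": "GENE_D", "chrom": "11", "genotypes": {"sib1": 1, "sib2": 0}},
]


def candidate_genes(variants, affected, all_male):
    """Group rare variants shared by all affected siblings by the model they could fit."""
    shared_het = defaultdict(int)
    result = {"homozygous": set(), "compound_het": set(), "x_linked": set()}
    for v in variants:
        counts = [v["genotypes"].get(sib, 0) for sib in affected]
        if min(counts) == 0:
            continue  # not carried by every affected sibling
        if v["chrom"] == "X":
            if all_male:  # X-linked model only considered for brother pairs here
                result["x_linked"].add(v["gene"])
        elif min(counts) == 2:
            result["homozygous"].add(v["gene"])
        else:
            shared_het[v["gene"]] += 1
    # Two or more shared heterozygous variants in one gene: possible compound heterozygosity
    result["compound_het"] = {g for g, n in shared_het.items() if n >= 2}
    return result


result = candidate_genes(variants, affected=["sib1", "sib2"], all_male=True)
print(result)
```

In this toy input, GENE_D drops out because only one sibling carries the variant, which mirrors how segregation in even a small sibship reduces the candidate list.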
Segregation analysis revealed homozygous rare variants in seven genes, compound heterozygous rare variants in eight genes and hemizygous rare variants in six X-chromosomal genes. In the families investigated in Chapter 2 no segregating homozygous rare variants were detected, except for an additional homozygous rare variant in MAN2A1 (Chapter 3, Supplementary table 4). Systematic variant interpretation and subsequent classification of all 21 segregating rare variants classified two homozygous variants, four heterozygous and three X-chromosomal variants as (possibly) pathogenic, which corresponds to (potentially) pathogenic mutations in zero to two (candidate) ID genes per family (n = 20 families). For the systematic interpretation of rare variants we generated a flow diagram that results in categorization of variants into three different classes: (i) pathogenic; mutations in known or novel genes, (ii) potentially pathogenic; rare variants that have a high suspicion of being pathogenic based on the mutation and gene characteristics, but that have been identified in a single family only, and (iii) likely benign; rare variants that have a high suspicion of being benign based on mutation and gene characteristics (Figure 1). Interpretation of pathogenicity of mutations on the

Figure 1 Flow diagram for classification of rare variants in individuals with ID

[Flow diagram image; panel labels: Research, Diagnostics]

X-chromosome in brother pairs is relatively easy as compared to interpretation of autosomal mutations (dominant and recessive), since the majority of X-chromosomal ID genes are known. As a result, pathogenic mutations were identified (Chapter 3) in two known X-linked ID genes (SLC9A6 and SLC6A8) in two brother pairs and in one novel autosomal recessive ID gene (DDHD2) in a brother and sister. For six other segregating recessive rare variants that were considered potentially pathogenic, namely one X-chromosomal variant in BCORL1 and five autosomal variants in MCM3AP, PTPRT, SYNE1, TDP2 and ZNF528, further studies are essential to prove pathogenicity. These results show the potential of exome sequencing for detection of causative mutations (three out of 20 families) and candidate mutations (six out of 20 families) in small sibling families, and thus its value for identifying known and novel molecular defects underlying ID. It offers the families and their clinicians a considerable chance of obtaining a molecular diagnosis and personalized genetic counseling. This study argues in favor of diagnostic application of exome sequencing in siblings, especially in families with affected brothers. The contribution of recessive mutations to disease pathology is obvious for families with multiple affected siblings. In sporadic individuals, too, these mutations are expected to explain a substantial proportion of cases, as the empirical recurrence risk for siblings of an individual affected with ID of unknown cause has been reported to be as high as 7-12%110, 211, 212. Because of the contribution of X-chromosomal genes, a higher recurrence rate is reported in males with an affected male sibling as compared to males or females with an affected female sibling: 7.5-9% recurrence risk if the index is a female versus 10-14% if the index is a male. In isolated individuals with severe ID up to 65% dominant de novo causes are found (Box 1).
If we assume that severe ID is mostly monogenic in origin, it follows that there will be an estimated maximum autosomal recessive contribution

Box 1

Estimated maximum contribution of recessive mutations in sporadic severe ID:
Structural chromosomal abnormalities: 20-30%
De novo base pair mutations: ~35%
X chromosomal in males: 10%
Estimated maximum contribution of recessive mutations if the index is a female: 100 - 65 = 35%
Estimated maximum contribution of recessive mutations if the index is a male: 100 - 65 - 10 = 25%


of 25% for sporadic affected males and 35% for sporadic females. This is in agreement with the empirical recurrence risk of 7-12%.
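The Box 1 arithmetic, and one way of reading its agreement with the empirical recurrence risk, can be written out as a short Python sketch. The percentages are those given in Box 1; the 1-in-4 sibling recurrence for an autosomal recessive condition is the standard Mendelian expectation and is introduced here as an assumption, since the text does not spell out this step. X-linked and other contributions to recurrence are deliberately ignored.

```python
def max_recessive_contribution(de_novo_pct: float, x_linked_pct: float = 0.0) -> float:
    """Upper bound (in %) left for recessive causes once other causes are subtracted."""
    return 100.0 - de_novo_pct - x_linked_pct


def sibling_risk_from_recessive(contribution_pct: float) -> float:
    """Assumption: each recessive case carries a 25% recurrence risk for a sibling."""
    return contribution_pct * 0.25


female_index = max_recessive_contribution(65.0)       # 35.0
male_index = max_recessive_contribution(65.0, 10.0)   # 25.0

print(sibling_risk_from_recessive(female_index))  # 8.75
print(sibling_risk_from_recessive(male_index))    # 6.25
```

Under these assumptions the recessive share alone would account for a sibling recurrence risk of roughly 6-9%, which is of the same order as the reported 7-12%.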

6.3 Genes for dominant forms of ID

Exome and whole genome sequencing provide the opportunity to study the human genome at single-nucleotide resolution, and have helped to elucidate mutational processes in humans and their impact on disease. Family-based studies have shown that on average 74 germline single-nucleotide variants occur de novo in an individual genome, of which on average one affects the coding sequence128, 293. This implies an average human germline mutation rate of 1.18 x 10^-8 for single-nucleotide variants, a number close to previous estimates based on single gene studies that have been extrapolated to the whole genome294, 295. This mutation rate can, however, vary between nucleotide sites depending on local sequence context and structure, as exemplified by CpG sites, which are high-potential mutational targets128, 293, 294. It has become clear that germline single-nucleotide mutation rates are influenced by paternal age, implying also that mutation rates may vary considerably between individuals128. So whereas whole chromosome events such as Down syndrome are often the result of meiotic defects in the female gamete, with an increasing risk as maternal age goes up, single-nucleotide germline mutations appear similarly age-related but arise from the complementary gamete. Recent findings facilitated by exome sequencing suggest that germline single-nucleotide variants, and more specifically de novo variants in the coding part of the genome, play a prominent role in the etiology of both isolated syndromic and non-syndromic ID (Chapter 1). For every de novo event in a gene that has been observed in a single patient only, the same question applies as for the mutations in the many candidate ID genes detected in single families with recessive ID: how should mutational consequences be interpreted in the context of the observed phenotype? Only for a minority of genes have multiple de novo mutations in the same gene been identified in separate cases134, 159, 296.
This is inherent to the extremely genetically heterogeneous nature of the condition. Since several thousand genes are expected to cause ID, most of these will each explain only a small percentage of the total ID population. This hampers the identification of de novo events in the same gene in several individuals and requires large and carefully characterized follow-up cohorts for detection of multiple individuals with the same underlying genetic defect. In this thesis the rare occurrence of a recurrent de novo event is described within the PACS1 gene (Chapter 5). The study exemplifies several important aspects for studies of heterogeneous conditions: the importance of detailed clinical phenotyping to select individuals with recognizable phenotypes for recurrent findings, the


importance of sharing information within the clinical community to enlarge patient populations, and collaboration between research/diagnostic groups, each with their own expertise, to grasp both the genetics and pathophysiology underlying ID.
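The relation between the count of de novo variants per genome and the germline mutation rate quoted earlier in this section can be checked with a short Python sketch. The figure of 74 de novo single-nucleotide variants comes from the text; the haploid genome size of roughly 3.1 x 10^9 base pairs is an assumption made here for illustration, and the denominator used in the cited studies (for example, only the callable part of the genome) may differ.

```python
# Assumption for illustration: approximate haploid human genome size in base pairs
HAPLOID_GENOME_BP = 3.1e9


def per_base_mutation_rate(de_novo_snvs: float, haploid_bp: float = HAPLOID_GENOME_BP) -> float:
    """De novo SNVs per diploid genome divided by the number of transmitted bases."""
    return de_novo_snvs / (2 * haploid_bp)


def expected_de_novo_snvs(rate: float, haploid_bp: float = HAPLOID_GENOME_BP) -> float:
    """Inverse calculation: expected de novo SNVs per individual for a given rate."""
    return rate * 2 * haploid_bp


rate = per_base_mutation_rate(74)
print(f"{rate:.2e}")
print(round(expected_de_novo_snvs(1.18e-8)))
```

With this assumed genome size the sketch gives a rate of about 1.2 x 10^-8 per base per generation, in line with the value quoted in the text.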

6.4 Establishing pathogenicity of mutations

Proving causality of mutations in candidate ID genes is essential for proper counseling of families. Detection of mutations in unrelated families with non-syndromic ID or a comparable syndromic ID phenotype will help to establish pathogenicity of novel ID genes. Cell biological assays that prove disruption of protein function by the mutation, together with studies in (animal) model systems that show involvement of the respective gene in neuronal functioning, learning and memory, might provide evidence for causation. As mentioned above, the individual contribution of most genes to the total ID population will be low. Large collections containing both detailed phenotypic information and DNA samples of individuals with ID will be required in order to find additional families harboring mutations in candidate ID genes. Such a cohort is currently available and being expanded in Nijmegen, where detailed phenotypic information on individuals with ID from all over the world is incorporated in a phenotype database. Data repositories such as the Nijmegen phenotype database serve several purposes: they can be used to study large cohorts of individuals with non-specific ID, and the annotation of detailed clinical features allows for selection and subsequent testing of patient populations with specific clinical features. Such a database has the potential for phenotype clustering and will possibly contribute to identification of novel syndromes. Another way to find additional families with mutations in candidate ID genes is by sharing genetic data, possibly via a database or weblog, where researchers can easily communicate and present their respective candidate mutations and genes. This will be useful since many of the genes in which a defect has been found in only a single family will not be reported in peer-reviewed medical and scientific journals and hence are not disclosed to the scientific community.
Setting up data repositories that, for example, contain potentially relevant data derived from exome or whole genome sequencing as well as phenotypic information will favor detection of multiple families with mutations in the same genes. Such exchange platforms for clinically relevant findings already exist for genomic copy number variations (CNVs), notably ECARUCA297 and DECIPHER298. Databases that collect genomic variation observed within the 'healthy' population also already exist. Examples are the Database of Genomic Variants, data derived from the 1000 Genomes Project, the Human Variome Project, and the Exome Variant Server. These databases also


greatly help mutation interpretation. Either the existing systems with CNV data need to be extended to incorporate potentially pathogenic single-nucleotide variations, or new databases need to be initiated. In vivo and in vitro follow-up studies are necessary after gene identification to (i) prove the effect of the mutation on protein activity and (ii) show involvement of the respective gene in neuronal and intellectual development and functioning. The type of assay or model system depends on gene characteristics. Initiation of such gene-specific in vivo and in vitro follow-up studies will profit from data sharing. Collaborative international studies, such as the GENCODYS consortium, where human genetics and more fundamental research groups adapt their study focus to each other's results, represent the success that can be obtained by data sharing between research groups. Presenting findings during international meetings often contributes to initiation of collaboration. It is important not only to collect patient DNA, but also to establish cell lines of easily accessible cell types such as leucocytes or fibroblasts, to be able to study the effect of mutations within patient cells. Of great promise for ID in this context are techniques under development to transform patient cells into pluripotent stem cells that in turn can be transformed into neuronal cells299-304. Studying cellular morphology of non-neuronal patient cells can help interpretation of the pathologic effect of a gene defect at the cellular level (Chapter 4). Not all preferred assays and model systems will be easily accessible in the original labs where ID genes are being identified, and therefore close collaborations between human genetics laboratories and more fundamental biological research groups will in many cases contribute to in-depth studies and eventually provide evidence for involvement of the respective genes in ID.

6.5 The missing mutations in ID: ‘you will find only what you are looking for’

Recent studies in ID show that MPS, and particularly exome sequencing, greatly contributes to mutation detection in known and novel ID genes110, 134 (Chapter 3-5). Screening of the currently known X-linked genes allows for detection of causative mutations in up to 70% of families with an X-linked inheritance pattern160. These numbers are less accurate for sporadic cases, as most studies focus on a selected population of individuals with moderate to severe ID who have already had thorough clinical and molecular examination. Individuals with moderate to severe ID likely harbor a higher proportion of monogenic events and de novo chromosomal abnormalities than the majority of the isolated ID population, which presents with mild ID. Studies in isolated moderate to severe ID indicate a maximum cumulative yield of ~65% (20-30% detectable by methods for chromosomal aberrations such as


array CGH and another 16-35% by exome sequencing)63-65, 93-96, 115, 134, 305. Mutation detection yield for autosomal recessive ID was found to be 19-57%110 by studying consanguineous families. Our study in small sibling families, although of small sample size, confirms this yield, as we identified (potentially) pathogenic mutations in up to 45% of these families (Chapter 3). These numbers look promising and are likely to increase in the coming years, but they also reflect the limitations of exome sequencing. Where are the remaining missing mutations? There are several plausible explanations of both technical and analytic origin: (i) exome sequencing targets the exonic sequences only, (ii) the experiment has laboratory imperfections and (iii) bioinformatics imperfections, (iv) the annotation of the human genome is incomplete, (v) CNV analysis is frequently not performed, (vi) chromosomal rearrangements will be missed, and lastly (vii) the presumed inheritance model might be incorrect. Exome sequencing is a genome-wide approach, but allows only for interrogation of the ~1% of the genome that is coding sequence and is captured and sequenced. It is believed that the majority of mutations, ~85%306, will be located in the highly conserved protein-coding region of the genome. The remaining ~15% of defects, located outside this area, will be missed by exome sequencing. Whole genome sequencing and transcriptome sequencing will in the coming years provide information on the contribution of the non-coding genome, which harbors, amongst others, regulatory elements, intronic sequences, nucleotide repeats and micro RNAs. Exome as well as whole genome experiments are still imperfect from both the laboratory and the bioinformatics side. Part of the coding sequence fails to amplify, for example GC-rich areas (often the first exons of genes have a high GC content) and repeat sequences.
A good example of this is ARX, one of the more frequently mutated X-chromosomal ID genes, which is GC-rich and very poorly covered by most exome platforms. Moreover, the mutational hotspot in this gene is not covered at all. Sequencing coverage in exome sequencing is not uniform and adequate coverage depth is needed to reliably call variants, although 85% of the coding region is normally still amplified with sufficient coverage (>10x)157. Furthermore, annotation of protein-coding genes is not complete. Mutations could reside in this 'uncaptured' part of the exome. On the bioinformatics side, regions with high homology are computationally difficult to map to the human reference genome, especially on sequencing platforms that make use of shorter reads (50-100 base pairs). An example is SLC6A8 on the X-chromosome, which shows >90% sequence homology with two regions on chromosome 16, resulting in incorrect mapping of reads originating from chromosome 16 to the X-chromosome and vice versa, thereby hampering mutation detection (Chapter 3)157. Indel variations are more difficult to both map and call than base pair substitutions. Read mapping and variant calling methods are under constant development and improvement.


Re-analyzing exome data on a regular basis (e.g. once every year) with updated software might in some cases identify pathogenic variants that had been missed previously. Small copy number variations, not detected by array CGH, might explain part of the cases without potentially pathogenic mutations. Copy number calling algorithms are not yet commonly used in exome data analysis. These software tools need improvement, especially for calling of heterozygous deletions and duplications. Adding CNV calling to data analysis will improve the yield of pathogenic mutations by revealing (partial) gene and exon deletions and duplications, in either homozygous or (compound) heterozygous state. Other chromosomal rearrangements that may also disrupt genes or influence transcriptional regulation, such as inversions and translocations, will still mostly be missed, especially by sequencing methods that do not make use of paired-end reads. Last but not least, 'you find only what you are looking for'. Analyzing isolated cases under the hypothesis of a de novo event will fail to reveal recessive mutations, dominant inherited mutations, germline mosaicism present in one of the parents, disease resulting from uniparental disomy, inherited X-linked mutations and polygenic inheritance. In mild ID, representing the majority of the total ID population, polygenic types of ID and dominant ID with incomplete penetrance may be more common than anticipated so far. When initial analysis does not reveal pathogenic mutations, a different inheritance model should be assumed. We are only beginning to understand the functional consequences of the immense genomic variation, and variants that we now regard as benign might turn out to contribute to disease, and the other way around. In the coming years, and most likely even decades, we still have much to learn from the immense genetic variation between humans and its clinical and biological implications.
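The coverage criterion mentioned in this section (the share of the coding region sequenced at more than 10x) can be illustrated with a minimal Python sketch. The depth values below are invented for illustration and the function name is hypothetical; in practice per-base depths would be derived from the aligned reads over the capture target.

```python
def fraction_covered(depths, min_depth=10):
    """Return the fraction of target positions with read depth above min_depth."""
    if not depths:
        raise ValueError("no positions supplied")
    return sum(1 for d in depths if d > min_depth) / len(depths)


# Hypothetical depths for ten coding positions; the low values mimic a GC-rich first exon
example_depths = [0, 0, 4, 12, 25, 31, 40, 18, 22, 15]
covered = fraction_covered(example_depths)
print(covered)  # 0.7
```

Reporting this fraction per gene rather than per exome would make poorly covered genes, such as the ARX example in the text, visible before a negative result is interpreted.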

6.6 Massive parallel sequencing makes its introduction into clinical genetics diagnostics

The developments in high-throughput sequencing methods and massive parallel sequencing technologies have made exome sequencing technically feasible and cost-effective. As a consequence, exome sequencing is increasingly being explored and implemented as a diagnostic test for specific genetic diseases. It appears especially useful for monogenic disorders that are characterized by significant genetic and phenotypic heterogeneity, such as ID. Family-based exome sequencing as a diagnostic test in ID has had its first pilot in 100 case-parent trios with isolated ID, yielding a definite pathogenic mutation in 16% of studied cases, mostly de novo mutations134. Potentially causative de novo mutations in novel ID genes were identified in another 19% of these trios. The diagnostic yield for individuals with moderate to severe ID will improve even further when exome sequencing becomes the initial genetic test: in the pilot study, individuals had previously been evaluated by a clinical geneticist to exclude known causes of ID and therefore do not represent the average ID patient population. The benefits of exome sequencing, both for patients and for the health care system, are obvious. For disorders caused by mutations in multiple genes, for example non-syndromic X-linked ID, for which >35 genes have been reported, the current molecular diagnostic strategy essentially involves a gene-by-gene approach. This is a time-consuming, expensive and often incomplete process, as usually only the frequently mutated genes are analyzed. Families often live in uncertainty for many years. Exome sequencing is fast and provides the means to analyze all known disease genes at once (in the above example, all non-syndromic X-linked ID genes) for rare single-nucleotide changes, small indels, and deletions and duplications. Because of its genome-wide character, incidental findings unrelated to the initial diagnostic question will occur. Such findings are comparable to incidental findings that may occur during investigations by, for example, the general practitioner, the radiologist, the surgeon or any other medical specialist, and should be addressed as such when informing patients. Current studies estimate an incidence of incidental findings of ~1% for exome sequencing307. Patients need to be informed about this chance of relevant findings that are unrelated to their initial question. To minimize the chance that patients receive medically relevant information not related to the disorder under investigation, a diagnostic data analysis process specifically focusing on known disease genes should always precede genome-wide investigations that have a research character.
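The two-step analysis proposed above, a diagnostic step restricted to known disease genes followed by a genome-wide research step, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the variant records and the three-gene list are hypothetical, and the gene list is an illustrative subset rather than a validated diagnostic panel.

```python
def tiered_analysis(variants, known_disease_genes):
    """Split rare variants into a diagnostic tier and a research tier."""
    diagnostic, research = [], []
    for variant in variants:
        if variant["gene"] in known_disease_genes:
            diagnostic.append(variant)  # reported in the diagnostic setting
        else:
            research.append(variant)    # examined only in a research context
    return diagnostic, research


# Illustrative subset of X-linked ID genes, not a validated diagnostic panel.
KNOWN_ID_GENES = {"ARX", "SLC6A8", "SLC9A6"}

# Hypothetical rare variants from one exome; identifiers are placeholders.
rare_variants = [
    {"id": "var1", "gene": "SLC6A8"},
    {"id": "var2", "gene": "GENE_OF_UNKNOWN_FUNCTION"},
]
diagnostic_tier, research_tier = tiered_analysis(rare_variants, KNOWN_ID_GENES)
print([v["id"] for v in diagnostic_tier], [v["id"] for v in research_tier])
```

Keeping the gene list as data separate from the filter means that the same stored exome can be re-filtered whenever the list of known disease genes is updated.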
As new disease genes are constantly being discovered, the available exome data will allow diagnostic re-analysis for variations in newly identified genes on a regular basis. For families seeking genetic counseling and for clinical geneticists, exome sequencing provides an accurate, relatively cheap and fast diagnostic test. Exome sequencing as a diagnostic test looks promising and pilot studies confirm this115' 115>126' 13M34; nevertheless, several technical challenges remain to be addressed. Further technical improvements in the efficacy and uniformity of exome capture methods are needed to ensure that all targeted exons are captured and sequenced at a uniform coverage depth. The sensitivity and specificity of variant detection, especially the detection of small indels and of deletions and duplications, need to be improved in order to reduce false-positive and false-negative rates. These improvements will eventually increase the diagnostic yield. DNA sequencing is at the end of the diagnostic route in clinical genetic practice today. Exome and whole genome sequencing will change diagnostic routing in the coming years and will become the initial genetic diagnostic test. It is inevitable that exome or genome sequencing will eventually be implemented as newborn screening. Non-invasive whole genome sequencing of cell-free fetal DNA in maternal plasma was successfully applied to a human fetus at 18.5 weeks of gestation308, paving the way for whole genome sequencing in prenatal diagnostics. Once it is affordable for the health care system, non-invasive whole genome sequencing can be expected to be offered either for special indications, such as maternal and paternal age, or to all pregnant women, similar to the current situation for non-invasive prenatal screening for trisomy 13, 18 and 21. The prospect of applying exome and genome sequencing in newborn screening and prenatal diagnostics provokes discussion about ethical implications. A public and open discussion between clinicians, molecular geneticists and parents will help to reach consensus on which relevant findings should be returned to parents, and for what purpose and on what indication prenatal testing should take place. A possible scenario for newborn screening is as follows: information about disorders that need early treatment and about untreatable childhood disorders will be returned. Information without direct implications in childhood will be shielded until the child reaches adulthood and can make a deliberate decision whether to obtain medically relevant genetic information. Clinical geneticists have the expertise to oversee the implications of genome-wide newborn and prenatal screening, and it is therefore their task to take the lead in these discussions.

6.7 Treatment

For decades, ID has been regarded as a developmental 'hard wiring' problem of the brain and was therefore considered an untreatable condition. The growing insight into the biological consequences of genetic causes of cognitive impairment, and subsequent therapeutic trials in animal models such as fruit fly and mouse, have provided a more positive perspective on the treatability of ID conditions309, 310. For example, in Fragile X syndrome, neurofibromatosis, tuberous sclerosis, and Rubinstein-Taybi, Angelman, Rett and Kleefstra syndromes, pharmacotherapeutic studies in animal models show reversibility of the cognitive phenotype even in adulthood309, 310. These results are encouraging and give hope to children and adults with ID and their families. The ultimate goal of such studies is to replace alleviating and supportive drug therapies with drug therapies that will (partly) reverse impairment in cognitive abilities. These therapies can be developed on the basis of a fundamental understanding of the biological processes underlying cognitive dysfunction. Nevertheless, the road to therapy is still long and requires several steps. It inevitably starts with clinical and genetic studies in man to define an ID phenotype and identify the underlying molecular defect. Next, animal models and other biological and neuroscientific studies provide insight into the biological pathways involved in pathogenesis. This knowledge allows for drug development and initiation of therapeutic trials in animal models. To complete the circle, clinical studies in man will, when successful, eventually lead to approval of the pharmacotherapeutic agent as a treatment. The encouraging results in the conditions mentioned above underscore the importance of knowledge and understanding of the biological pathways affected in ID. It clearly follows that (personalized) therapy development relies greatly on gene identification. Large collections of clinically well-characterized individuals with ID, exome or whole genome sequencing to identify the underlying genetic defect, and cell biological and animal studies to comprehend the mutational consequences and biological pathways involved will in the coming years pave the way for the development of novel treatment strategies and open the door to personalized medicine.

Summary Samenvatting Reference List Dankwoord | Acknowledgements Curriculum Vitae List of publications List of abbreviations Color figures

Summary

The brain is one of the most complex and intriguing human organs, and its development is a fine-tuned process crucial for its normal functioning. Aberrant brain development or aberrant interaction between brain cells can present clinically as intellectual disability (ID). ID is a common disorder affecting 1-3% of the general population and has a major impact on personal life and society. ID is defined as a significant impairment in cognitive functioning and adaptive behavior with an onset before 18 years of age, resulting in a decreased ability to adapt to daily life and the social environment. The etiology of ID includes genetic, non-genetic and unknown causes. At the genetic level, the defects leading to ID range from whole-chromosome abnormalities to single base pair mutations. Due to genetic heterogeneity, about half of the individuals with a suspected genetic cause underlying ID do not receive a molecular diagnosis. The technologies used in molecular diagnostics to detect the genetic defects underlying ID are under constant development. The relatively recent introduction of massive parallel sequencing (MPS) technology has created a revolution in human genetics and in the study of Mendelian disorders, including ID. The main aim of this thesis is to identify the underlying genetic defect in families with ID of unknown origin (Chapter 1.6.2). The work described reflects the advantages and possibilities that MPS technology, such as exome sequencing, brings to the identification of causative mutations in families with ID of unknown origin. The presented studies also show the importance of (i) collecting large cohorts of clinically well-characterized individuals with ID and (ii) follow-up studies in cell biological and animal models for establishing the pathogenicity of mutations identified by MPS technologies. Chapter 1 of this thesis gives a general introduction to genetic studies of ID.
This chapter discusses the heterogeneous nature of ID and provides an overview of the different techniques and approaches used to unravel the genetics underlying ID, presented in chronological order. The fast technological developments since the introduction of MPS and the subsequent increase in the discovery of pathogenic mutations in novel disease genes are also discussed. Chapter 2 describes an alternative use of homozygosity mapping. This method is frequently used to determine the genomic region that is likely to harbor the genetic defect underlying ID in consanguineous pedigrees. Here, we studied ten non-consanguineous families with two or more siblings affected with non-syndromic ID and applied SNP array genotyping to identify shared homozygous stretches in the genomes of affected sibling pairs. Shared homozygous stretches (< 5 Mb) were identified in siblings of all families. Longer shared homozygous stretches (> 5 Mb) were observed in two families. None of the shared regions of homozygosity encompassed known non-syndromic ID genes, but overlap with four loci for non-syndromic autosomal recessive ID was observed. Although these results initially looked promising, this approach was quickly outpaced by the availability of exome sequencing. In Chapter 3, we chose to further investigate small and mostly non-consanguineous families with two or more siblings affected with ID by exome sequencing. Twenty families were selected, among which were four of the families described in Chapter 2. Exome data analysis and subsequent segregation analysis revealed homozygous rare variants in seven genes, compound heterozygous rare variants in eight genes and hemizygous rare variants in six X-chromosomal genes. Systematic interpretation and classification of all 21 segregating rare variants classified two homozygous variants in TDP2 and MCM3AP, eight compound heterozygous variants in DDHD2, PTPRT, SYNE1 and ZNF528, and three X-chromosomal variants in SLC6A8, SLC9A6 and BCORL1 as (potentially) pathogenic. In this chapter we show the value of studying small sibling families for the identification of ID genes, since recessive pathogenic alleles could readily be identified by exome sequencing. Chapter 4 describes in-depth studies of one of the potential novel ID genes, DDHD2, in which mutations were identified in Chapter 3. In total, mutations in DDHD2 were identified in four unrelated families with a similar clinical presentation. Affected individuals presented with ID, early-onset spastic paraplegia and a consistent pattern of structural and metabolic brain abnormalities, consisting of a markedly thin corpus callosum combined with subtle periventricular white matter hyperintensities and an abnormal cerebral proton magnetic resonance (MR) spectrum with a lipid peak showing highest intensity around the basal ganglia/thalamus area. The latter represents a useful distinguishing biomarker in clinical evaluation.
To further investigate the role of DDHD2 in the central nervous system, we performed in vivo studies of the synaptic terminals in the fruit fly, Drosophila melanogaster, with knockdown of Ddhd and observed a reduced number of active zones, which represent the chemical synapses. The mutations in DDHD2 in the four families with a similar clinical presentation and the in vivo studies in Ddhd knockdown Drosophila models underline an essential role for DDHD2 in the central nervous system and implicate DDHD2 as a novel gene for spasticity with ID. In Chapter 5, two unrelated boys with ID and a striking facial resemblance were investigated by patient-parent trio exome sequencing. The strikingly similar facial dysmorphisms suggested a distinct dominant syndrome with possibly a common genetic defect. Remarkably, the same de novo base pair change was identified in PACS1 in both boys. To further understand the pathomechanism of the mutation, PACS1 was studied in vitro and in vivo. The mutant protein forms cytoplasmic aggregates in vitro and exhibits impaired binding to a specific isoform of one of its target proteins (TRPV4). In vivo studies in zebrafish embryos injected with mutant PACS1 mRNA suggest a dominant-negative effect and showed induced craniofacial abnormalities that are driven by aberrant specification of cranial neural crest cells. This study suggests that PACS1 is necessary for the formation of craniofacial structures and that the identified mutation results in a specific syndromic ID phenotype. In Chapter 6, the implications of the use of MPS technology for pathogenic mutation identification in ID are discussed. Exome sequencing results have raised high expectations for implementation as a diagnostic test. The advantages as well as the challenges of using exome sequencing in a diagnostic setting are discussed. The heterogeneous genetic nature of ID often requires follow-up studies to provide evidence for the pathogenicity of unique mutations in novel genes. For these follow-up studies, large collections containing detailed phenotypic information and DNA samples of individuals with ID will be required, as well as data sharing within the genetic community. As demonstrated in Chapters 4 and 5, in vivo and in vitro follow-up studies are also necessary (i) to prove the effect of the mutation on protein activity and (ii) to show involvement of the respective gene in neuronal and/or intellectual development and functioning. Understanding the genetic defects underlying ID and the function of the genes involved is essential for diagnosis, prognosis and genetic counseling and will eventually pave the way for the development of treatment strategies.

Samenvatting

The brain is a complex and intriguing human organ. Brain development is a highly refined process that is crucial for the eventual normal functioning of the brain. Disturbances during brain development or disturbed interaction between brain cells can lead to intellectual disability (ID). With a prevalence of 1-3%, ID is a common disorder with a major personal and societal impact. ID is a developmental disorder characterized by significant limitation of cognitive functioning and of adaptive functions. The causes of ID can be genetic, non-genetic or idiopathic. Various genetic defects, ranging from complete numerical chromosomal abnormalities to changes of a single base pair, can lead to ID. This genetic variation (heterogeneity) complicates diagnostics. In only half of the cases in which a genetic cause of ID is suspected is a genetic defect actually found. Fortunately, the options for diagnostics keep improving owing to the rapid development of new genetic technologies for mutation detection. Around 2005, massive parallel sequencing (MPS) was introduced, and this technology meant a revolution for human genetics, including the studies of the genetic causes of ID. The research described in this thesis aims to find the underlying genetic cause of ID in individuals and families in whom no cause has been found to date with the available diagnostic options (Chapter 1.6.2). The results described in the various chapters reflect the impact and possibilities that MPS technology, such as exome sequencing (mapping the coding part of the genome), offers for studying individuals with ID.
This highlights the importance of (i) collecting large and clinically well-defined patient cohorts and (ii) performing follow-up studies in cell biological and animal models to determine the involvement of mutations in these genes in cognitive processes. Chapter 1 of this thesis contains a general introduction to genetic research in individuals with ID. The heterogeneous nature of ID is discussed and a chronological overview is given of the different techniques and strategies applied to detect the underlying defects. The rapid developments since the introduction of MPS and the subsequent increase in the identification of disease genes for ID are also discussed. Chapter 2 describes an alternative application of homozygosity mapping. This strategy is commonly used to study consanguineous families (families in which the parents are blood relatives) by detecting homozygous genomic regions in which the genetic defect probably lies. In Chapter 2 we apply the same strategy to ten small, non-consanguineous families with at least two siblings with non-syndromic ID. Using SNP array genotyping, shared homozygous regions were identified per family. Such shared homozygous regions (< 5 Mb) were found in all families. In two families, larger shared homozygous regions (> 5 Mb) were present. These homozygous regions did not contain genes previously associated with non-syndromic ID. However, there was overlap with four loci for non-syndromic autosomal recessive ID. Although the results were encouraging, this strategy was almost immediately overtaken by the possibilities offered by exome sequencing. In Chapter 3, twenty small, mostly non-consanguineous families with at least two siblings with ID are studied by exome sequencing. Four of the families were also described in Chapter 2. Analysis of the rare genomic variants detected by exome sequencing and subsequent segregation analysis resulted in homozygous rare variants in seven genes, compound heterozygous rare variants in eight genes and hemizygous rare variants in six X-chromosomal genes. After systematic interpretation and classification, two homozygous variants in TDP2 and MCM3AP, eight compound heterozygous variants in DDHD2, PTPRT, SYNE1 and ZNF528, and three X-chromosomal variants in SLC6A8, SLC9A6 and BCORL1 were designated as (possibly) pathogenic. In this chapter we show that recessive pathogenic alleles can be identified relatively easily by exome sequencing in small families with multiple affected siblings with ID. Chapter 4 describes a follow-up study of one of the candidate ID genes, DDHD2, identified in Chapter 3.
Mutations in DDHD2 were found in four unrelated families with a similar clinical presentation. Affected individuals presented with ID, spasticity with onset before the second year of life, and a consistent pattern of structural and metabolic brain abnormalities consisting of a thin corpus callosum with subtle periventricular white matter abnormalities and an abnormal proton magnetic resonance (MR) spectrum with a lipid peak around the basal ganglia/thalamus region. The lipid peak in the MR spectrum represents a distinguishing biomarker in clinical evaluation. In vivo studies of the synaptic terminal in the fruit fly, Drosophila melanogaster, with knockdown of Ddhd confirmed a role for DDHD2 in the central nervous system: a reduced number of active zones (also called the 'chemical synapse') was observed in fruit flies with Ddhd knockdown. The mutations in DDHD2 in four families presenting with a similar clinical picture and the in vivo studies in the fruit fly with knockdown of Ddhd both confirm an essential role for DDHD2 in the central nervous system and define a new gene for spasticity with ID. Chapter 5 describes two unrelated boys with ID and a striking facial resemblance. Their strikingly similar facial dysmorphisms suggested a dominant syndrome with possibly a common genetic defect. Remarkably, exome sequencing of both boys and their parents resulted in the identification of an identical de novo change in PACS1. In vitro studies of the mutant protein showed formation of aggregates in the cytoplasm as well as reduced binding of mutant PACS1 protein to a specific isoform of one of its interactors (TRPV4). In vivo studies in zebrafish embryos injected with mutant PACS1 mRNA suggested a dominant-negative effect of the mutation. In addition, mutant zebrafish embryos showed craniofacial abnormalities caused by a defect in the migration of neural crest cells. The results of this study suggest that PACS1 is involved in the formation of craniofacial structures and that the identified mutation leads to a specific form of ID. Chapter 6 discusses the implications of the use of MPS technology for elucidating genetic causes of ID. The results already published that were obtained by exome sequencing raise high expectations for the implementation of exome sequencing in a diagnostic setting. The advantages, but also the challenges that await us when exome sequencing is implemented in the diagnostic work-up of individuals with ID, are discussed. Because of the heterogeneous genetic nature of ID, follow-up studies are often needed to prove causality of unique mutations in new ID genes. For these follow-up studies it is necessary to collect detailed clinical information and DNA from large numbers of individuals with ID.
In addition, sharing of genetic data among geneticists and researchers worldwide is desirable. Chapters 4 and 5 show that in vivo and in vitro follow-up studies are moreover needed to confirm (i) the effect of mutations on protein activity and (ii) the involvement of the respective gene in neuronal development and in intellectual development and functioning. Understanding the genetic defects underlying ID and the function of the genes involved is ultimately essential for risk assessment, diagnosis, prognosis and the development of treatments.

Reference List

1. Leonard,H. & Wen,X. The epidemiology of mental retardation: challenges and opportunities in the new millennium. Ment. Retard. Dev. Disabil. Res. Rev. 8, 117-134 (2002).
2. Schalock,R.L. et al. The renaming of mental retardation: understanding the change to the term intellectual disability. Intellect. Dev. Disabil. 45, 116-124 (2007).
3. World Health Organization. The International Classification of Diseases and Related Health Problems (ICD-10). (1994).
4. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition - Text Revision (DSM-IV-TR). (2000).
5. Schalock,R.L. et al. AAIDD's 11th Edition of Intellectual Disability: Definition, Classification, and Systems of Support. American Association on Intellectual and Developmental Disabilities (2012).
6. Deary,I.J. & Batty,G.D. Cognitive epidemiology. J. Epidemiol. Community Health 61, 378-384 (2007).
7. Zigler,E. Familial mental retardation: a continuing dilemma. Science 155, 292-298 (1967).
8. Roeleveld,N., Zielhuis,G.A., & Gabreels,F. The prevalence of mental retardation: a critical review of recent literature. Developmental Medicine and Child Neurology 39, 125-132 (1997).
9. Yeargin-Allsopp,M., Murphy,C.C., Oakley,G.P., & Sikes,R.K. A multiple-source method for studying the prevalence of developmental disabilities in children: the Metropolitan Atlanta Developmental Disabilities Study. Pediatrics 89, 624-630 (1992).
10. Lehrke,R. Theory of X-linkage of major intellectual traits. Am. J. Ment. Defic. 76, 611-619 (1972).
11. Lehrke,R.G. X-linked mental retardation and verbal disability. Birth Defects Orig. Artic. Ser. 10, 1-100 (1974).
12. Gecz,J., Shoubridge,C., & Corbett,M. The genetic landscape of intellectual disability arising from chromosome X. Trends Genet. 25, 308-316 (2009).
13. Ropers,H.H. Genetics of early onset cognitive impairment. Annu. Rev. Genomics Hum. Genet. 11, 161-187 (2010).
14. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD). Online Mendelian Inheritance in Man, OMIM® (2012).
15. van Bokhoven,H. Genetic and epigenetic networks in intellectual disabilities. Annu. Rev. Genet. 45, 81-104 (2011).
16. Wilska,M. & Kaski,M. Aetiology of intellectual disability - the Finnish classification: development of a method to incorporate WHO ICD-10 coding. J. Intellect. Disabil. Res. 43 (Pt 3), 242-250 (1999).
17. Rauch,A. et al. Diagnostic yield of various genetic approaches in patients with unexplained developmental delay or mental retardation. American Journal of Medical Genetics Part A 140A, 2063-2074 (2006).
18. de Vries,B.B. et al. Screening and diagnosis for the fragile X syndrome among the mentally retarded: an epidemiological and psychological survey. Collaborative Fragile X Study Group. Am. J. Hum. Genet. 61, 660-667 (1997).
19. Down,J.L. Observations on an ethnic classification of idiots. 1866. Ment. Retard. 33, 54-56 (1995).
20. Lejeune,J., Gautier,M., & Turpin,R. Etude des chromosomes somatiques de neuf enfants mongoliens. C. R. Acad. Sci. 248, 1721-1722 (1959).
21. Egan,J.F. et al. Down syndrome births in the United States from 1989 to 2001. Am. J. Obstet. Gynecol. 191, 1044-1048 (2004).
22. Morris,J.K. & Alberman,E. Trends in Down's syndrome live births and antenatal diagnoses in England and Wales from 1989 to 2008: analysis of data from the National Down Syndrome Cytogenetic Register. BMJ 339, b3794 (2009).
23. Edwards,J.H. et al. A new trisomic syndrome. Lancet 1, 787-790 (1960).
24. Patau,K. et al. Multiple congenital anomaly caused by an extra autosome. Lancet 1, 790-793 (1960).
25. Classic pages in obstetrics and gynecology by Henry H. Turner. A syndrome of infantilism, congenital webbed neck, and cubitus valgus. Endocrinology, vol. 23, pp. 566-574, 1938. Am. J. Obstet. Gynecol. 113, 279 (1972).
26. Ford,C.E. et al. A sex-chromosome anomaly in a case of gonadal dysgenesis (Turner's syndrome). Lancet 1, 711-713 (1959).

27. Klinefelter,H., Reifenstein,E., & Albright,F. Syndrome characterized by gynecomastia, aspermatogenesis without a-Leydigism and increased excretion of follicle-stimulating hormone. Clin. Endocrinol. Metab. 2(11), 615-624 (1942).
28. Jacobs,P.A. & Strong,J.A. A case of human intersexuality having a possible XXY sex-determining mechanism. Nature 183, 302-303 (1959).
29. Lejeune,J. et al. Trois cas de deletion partielle du bras court d'un chromosome 5. C. R. Acad. Sci. 257, 3098 (1963).
30. Hirschhorn,K., Cooper,H.L., & Firschein,I.L. Deletion of short arms of chromosome 4-5 in a child with defects of midline fusion. Humangenetik 1, 479-482 (1965).
31. Wolf,U., Reinwein,H., Porsch,R., Schroter,R., & Baitsch,H. [Deficiency on the short arms of a chromosome No. 4]. Humangenetik 1, 397-413 (1965).
32. Lubs,H.A. A marker X chromosome. Am. J. Hum. Genet. 21, 231-244 (1969).
33. Giraud,F., Ayme,S., Mattei,J.F., & Mattei,M.G. Constitutional chromosomal breakage. Hum. Genet. 34, 125-136 (1976).
34. Harvey,J., Judge,C., & Wiener,S. Familial X-linked mental retardation with an X . J. Med. Genet. 14, 46-50 (1977).
35. Sutherland,G.R. Fragile sites on human chromosomes: demonstration of their dependence on the type of tissue culture medium. Science 197, 265-266 (1977).
36. Martin,J.P. & Bell,J. A pedigree of mental defect showing sex-linkage. J. Neurol. Psychiat. 6, 154-157 (1943).
37. Kremer,E.J. et al. Mapping of DNA instability at the fragile X to a trinucleotide repeat sequence p(CCG)n. Science 252, 1711-1714 (1991).
38. Hagerman,P.J. The fragile X prevalence paradox. J. Med. Genet. 45, 498-499 (2008).
39. Turner,G., Webb,T., Wake,S., & Robinson,H. Prevalence of fragile X syndrome. Am. J. Med. Genet. 64, 196-197 (1996).
40. Phelan,M.C., Crawford,E.C., & Bealer,D.M. Mental retardation in South Carolina. In: Saul,R.A. & Phelan,M.C. (eds) Proceedings of the Greenwood Genetic Center, 45-60. Greenwood Genetic Center, Greenwood, SC (1996).
41. Schinzel,A. Catalogue of unbalanced chromosome aberrations in man. 2nd ed. Walter de Gruyter, Berlin (2001).
42. Williams,J.C., Barratt-Boyes,B.G., & Lowe,J.B. Supravalvular aortic stenosis. Circulation 24, 1311-1318 (1961).
43. Beuren,A.J., Apitz,J., & Harmjanz,D. Supravalvular aortic stenosis in association with mental retardation and a certain facial appearance. Circulation 26, 1235-1240 (1962).
44. Ewart,A.K. et al. Hemizygosity at the elastin locus in a developmental disorder, Williams syndrome. Nat. Genet. 5, 11-16 (1993).
45. Angelman,H. 'Puppet children': a report of three cases. Dev. Med. Child Neurol. 7, 681-688 (1965).
46. Magenis,R.E., Brown,M.G., Lacy,D.A., Budden,S., & LaFranchi,S. Is Angelman syndrome an alternate result of del(15)(q11q13)? Am. J. Med. Genet. 28, 829-838 (1987).
47. Prader,A., Labhart,A., & Willi,H. Ein Syndrom von Adipositas, Kleinwuchs, Kryptorchismus und Oligophrenie nach Myatonieartigem Zustand im Neugeborenenalter. Schweiz. Med. Wschr. 86, 1260-1261 (1956).
48. Ledbetter,D.H. et al. Deletions of chromosome 15 as a cause of the Prader-Willi syndrome. N. Engl. J. Med. 304, 325-329 (1981).
49. DiGeorge,A.M. Congenital absence of the thymus and its immunologic consequences: concurrence with congenital hypoparathyroidism. Birth Defects Orig. Art. Ser. IV. 1, 116-121 (1968).
50. de la Chapelle,A., Herva,R., Koivisto,M., & Aula,P. A deletion in chromosome 22 can cause DiGeorge syndrome. Hum. Genet. 57, 253-256 (1981).
51. de Vries,B.B.A., Winter,R., Schinzel,A., & van Ravenswaaij-Arts,C. Telomeres: a diagnosis at the end of the chromosomes. Journal of Medical Genetics 40, 385-398 (2003).
52. van Bon,B.W.M. General introduction and outline for this thesis. Ipskamp Drukkers, Enschede, The Netherlands 1, 17-97 (2010).
53. Kleefstra,T. et al. Disruption of the gene Euchromatin Histone Methyl Transferase1 (Eu-HMTase1) is associated with the 9q34 subtelomeric deletion syndrome. Journal of Medical Genetics 42, 299-306 (2005).


54. Kleefstra,T. et al. Loss-of-function mutations in euchromatin histone methyl transferase 1 (EHMT1) cause the 9q34 subtelomeric deletion syndrome. Am. J. Hum. Genet. 79, 370-377 (2006).
55. Bonaglia,M.C. et al. Disruption of the ProSAP2 gene in a t(12;22)(q24.1;q13.3) is associated with the 22q13.3 deletion syndrome. Am. J. Hum. Genet. 69, 261-268 (2001).
56. Sarasua,S.M. et al. Association between deletion size and important phenotypes expands the genomic region of interest in Phelan-McDermid syndrome (22q13 deletion syndrome). J. Med. Genet. 48, 761-766 (2011).
57. Flint,J. et al. The detection of subtelomeric chromosomal rearrangements in idiopathic mental retardation. Nat. Genet. 9, 132-140 (1995).
58. Knight,S.J. et al. Subtle chromosomal rearrangements in children with unexplained mental retardation. Lancet 354, 1676-1681 (1999).
59. Koolen,D.A. et al. Screening for subtelomeric rearrangements in 210 patients with unexplained mental retardation using multiplex ligation dependent probe amplification (MLPA). Journal of Medical Genetics 41, 892-899 (2004).
60. Vissers,L.E.L.M. et al. Submicroscopic chromosomal abnormalities and large-scale polymorphisms in mental retardation patients detected by genome wide microarray analysis. American Journal of Human Genetics 73, 426 (2003).
61. Schoumans,J. et al. Detection of chromosomal imbalances in children with idiopathic mental retardation by array based comparative genomic hybridisation (array-CGH). J. Med. Genet. 42, 699-705 (2005).
62. Menten,B. et al. Emerging patterns of cryptic chromosomal imbalance in patients with idiopathic mental retardation and multiple congenital anomalies: a new series of 140 patients and review of published reports. J. Med. Genet. 43, 625-633 (2006).
63. Koolen,D.A. et al. Genomic Microarrays in Mental Retardation: A Practical Workflow for Diagnostic Applications. Human Mutation 30, 283-292 (2009).
64. de Vries,B.B.A. et al. Diagnostic genome profiling in mental retardation. American Journal of Human Genetics 77, 606-616 (2005).
65. Shaw-Smith,C. et al. Microarray based comparative genomic hybridisation (array-CGH) detects submicroscopic chromosomal deletions and duplications in patients with learning disability/mental retardation and dysmorphic features. J. Med. Genet. 41, 241-248 (2004).
66. Hehir-Kwa,J.Y. From copy number identification to copy number interpretation. Gildeprint drukkerij (2010).
67. Koolen,D.A. Copy number variation and mental retardation. Ponsen & Looijen B.V. (2008).
68. Mefford,H.C. et al. Recurrent rearrangements of chromosome 1q21.1 and variable pediatric phenotypes. New England Journal of Medicine 359, 1685-U130 (2008).
69. Van der Aa,N. et al. Fourteen new cases contribute to the characterization of the 7q11.23 microduplication syndrome. European Journal of Medical Genetics 52, 94-100 (2009).
70. Balciuniene,J. et al. Recurrent 10q22-q23 deletions: a genomic disorder on 10q associated with cognitive and behavioral abnormalities. Am. J. Hum. Genet. 80, 938-947 (2007).
71. van Bon,B.W. et al. The phenotype of recurrent 10q22q23 deletions and duplications. Eur. J. Hum. Genet. 19, 400-408 (2011).
72. Ben-Shachar,S. et al. Microdeletion 15q13.3: a locus with incomplete penetrance for autism, mental retardation, and psychiatric disorders. J. Med. Genet. 46, 382-388 (2009).
73. Sharp,A.J. et al. A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nat. Genet. 40, 322-328 (2008).
74. van Bon,B.W.M. et al. Further delineation of the 15q13 microdeletion and duplication syndromes: a clinical spectrum varying from non-pathogenic to a severe outcome. Journal of Medical Genetics 46, 511-523 (2009).
75. Ballif,B.C. et al. Discovery of a previously unrecognized microdeletion syndrome of 16p11.2-p12.2. Nat. Genet. 39, 1071-1073 (2007).
76. Battaglia,A., Novelli,A., Bernardini,L., Igliozzi,R., & Parrini,B. Further characterization of the new microdeletion syndrome of 16p11.2-p12.2. Am. J. Med. Genet. A 149A, 1200-1204 (2009).
77. Patil,S.R. & Bartley,J.A. Interstitial deletion of the short arm of chromosome 17. Hum. Genet. 67, 237-238 (1984).


78. Smith,A.C. et al. Interstitial deletion of (17)(p11.2p11.2) in nine patients. Am. J. Med. Genet. 24, 393-414 (1986).
79. Brown,A. et al. Two patients with duplication of 17p11.2: the reciprocal of the Smith-Magenis syndrome deletion? Am. J. Med. Genet. 63, 373-377 (1996).
80. Potocki,L. et al. Molecular mechanism for duplication 17p11.2- the homologous recombination reciprocal of the Smith-Magenis microdeletion. Nat. Genet. 24, 84-87 (2000).
81. Koolen,D.A. et al. A new chromosome 17q21.31 microdeletion syndrome associated with a common inversion polymorphism. Nat. Genet. 38, 999-1001 (2006).
82. Koolen,D.A. et al. Clinical and molecular delineation of the 17q21.31 microdeletion syndrome. J. Med. Genet. 45, 710-720 (2008).
83. van Bon,B.W. et al. The 2q23.1 microdeletion syndrome: clinical and behavioural phenotype. Eur. J. Hum. Genet. 18, 163-170 (2010).
84. Palomares,M. et al. Characterization of a 8q21.11 microdeletion syndrome associated with intellectual disability and a recognizable phenotype. Am. J. Hum. Genet. 89, 295-301 (2011).
85. Amiel,J. et al. Mutations in TCF4, encoding a class I basic helix-loop-helix transcription factor, are responsible for Pitt-Hopkins syndrome, a severe epileptic encephalopathy associated with autonomic dysfunction. Am. J. Hum. Genet. 80, 988-993 (2007).
86. Zweier,C. et al. Haploinsufficiency of TCF4 causes syndromal mental retardation with intermittent hyperventilation (Pitt-Hopkins syndrome). Am. J. Hum. Genet. 80, 994-1001 (2007).
87. Kulharya,A.S., Michaelis,R.C., Norris,K.S., Taylor,H.A., & Garcia-Heras,J. Constitutional del(19)(q12q13.1) in a three-year-old girl with severe phenotypic abnormalities affecting multiple organ systems. Am. J. Med. Genet. 77, 391-394 (1998).
88. Malan,V. et al. 19q13.11 deletion syndrome: a novel clinically recognisable genetic condition identified by array comparative genomic hybridisation. J. Med. Genet. 46, 635-640 (2009).
89. Schuurs-Hoeijmakers,J.H. et al. Refining the critical region of the novel 19q13.11 microdeletion syndrome to 750 Kb. J. Med. Genet. 46, 421-423 (2009).
90. Koolen,D.A. et al. Molecular characterisation of patients with subtelomeric 22q abnormalities using chromosome specific array-based comparative genomic hybridisation. European Journal of Human Genetics 13, 1019-1024 (2005).
91. Phelan,M.C. et al. 22q13 deletion syndrome. Am. J. Med. Genet. 101, 91-99 (2001).
92. Conrad,D.F. et al. Origins and functional impact of copy number variation in the human genome. Nature 464, 704-712 (2010).
93. Vissers,L.E.L.M. et al. Array-based comparative genomic hybridization for the genomewide detection of submicroscopic chromosomal abnormalities. American Journal of Human Genetics 73, 1261-1270 (2003).
94. Veltman,J.A. Genomic microarrays in clinical diagnosis. Curr. Opin. Pediatr. 18, 598-603 (2006).
95. Knight,S.J. & Regan,R. Idiopathic learning disability and genome imbalance. Cytogenet. Genome Res. 115, 215-224 (2006).
96. Miller,D.T. et al. Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am. J. Hum. Genet. 86, 749-764 (2010).
97. Schuurs-Hoeijmakers,J.H.M. et al. Refining the critical region of the novel 19q13.11 microdeletion syndrome to 750 Kb. Journal of Medical Genetics 46, 421-423 (2009).
98. Gana,S. et al. 19q13.11 cryptic deletion: description of two new cases and indication for a role of WTIP haploinsufficiency in hypospadias. Eur. J. Hum. Genet. (2012).
99. Vissers,L.E.L.M. et al. Mutations in a new member of the chromodomain gene family cause CHARGE syndrome. Nature Genetics 36, 955-957 (2004).
100. Koolen,D.A. et al. Mutations in the chromatin modifier gene KANSL1 cause the 17q21.31 microdeletion syndrome. Nat. Genet. 44, 639-641 (2012).
101. Zollino,M. et al. Mutations in KANSL1 cause the 17q21.31 microdeletion syndrome phenotype. Nat. Genet. 44, 636-638 (2012).
102. Bittles,A.H. A community genetics perspective on consanguineous marriage. Community Genet. 11, 324-330 (2008).


103. Ropers,H.H. & Hamel,B.C.J. X-linked mental retardation. Nature Reviews Genetics 6, 46-57 (2005).
104. Ropers,H.H. X-linked mental retardation: many genes for a complex disorder. Curr. Opin. Genet. Dev. 16, 260-269 (2006).
105. de Brouwer,A.P. et al. Mutation frequencies of X-linked mental retardation genes in families from the EuroMRX consortium. Hum. Mutat. 28, 207-208 (2007).
106. Tarpey,P.S. et al. A systematic, large-scale resequencing screen of X-chromosome coding exons in mental retardation. Nature Genetics 41, 535-543 (2009).
107. Najmabadi,H. et al. Homozygosity mapping in consanguineous families reveals extreme heterogeneity of non-syndromic autosomal recessive mental retardation and identifies 8 novel gene loci. Hum. Genet. 121, 43-48 (2007).
108. Ropers,H.H. Genetics of intellectual disability. Curr. Opin. Genet. Dev. 18, 241-250 (2008).
109. Hoischen,A. et al. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42, 483-485 (2010).
110. Najmabadi,H. et al. Deep sequencing reveals 50 novel genes for recessive cognitive disorders. Nature 478, 57-63 (2011).
111. Ng,S.B. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat. Genet. 42, 790-793 (2010).
112. Riviere,J.B. et al. De novo mutations in the actin genes ACTB and ACTG1 cause Baraitser-Winter syndrome. Nat. Genet. 44, 440-442 (2012).
113. Tsurusaki,Y. et al. Mutations affecting components of the SWI/SNF complex cause Coffin-Siris syndrome. Nat. Genet. 44, 376-378 (2012).
114. van Bon,B.W. et al. Cantu Syndrome Is Caused by Mutations in ABCC9. Am. J. Hum. Genet. 90, 1094-1101 (2012).
115. Vissers,L.E. et al. A de novo paradigm for mental retardation. Nat. Genet. 42, 1109-1112 (2010).
116. Ng,S.B. et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272-276 (2009).
117. Ng,S.B. et al. Exome sequencing identifies the cause of a mendelian disorder. Nat. Genet. 42, 30-35 (2010).
118. Doherty,D. & Bamshad,M.J. Exome sequencing to find rare variants causing neurologic diseases. Neurology (2012).
119. Robinson,P.N. Whole-exome sequencing for finding de novo mutations in sporadic mental retardation. Genome Biol. 11, 144 (2010).
120. Hoischen,A. et al. De novo nonsense mutations in ASXL1 cause Bohring-Opitz syndrome. Nat. Genet. 43, 729-731 (2011).
121. Bienengraeber,M. et al. ABCC9 mutations identified in human dilated cardiomyopathy disrupt catalytic KATP channel gating. Nat. Genet. 36, 382-387 (2004).
122. Harakalova,M. et al. Dominant missense mutations in ABCC9 cause Cantu syndrome. Nat. Genet. 44, 793-796 (2012).
123. Rauch,A. et al. Range of genetic mutations associated with severe non-syndromic intellectual disability: an exome sequencing study. Lancet [Epub ahead of print] (2012).
124. Kirov,G. et al. De novo CNV analysis implicates specific abnormalities of postsynaptic signalling complexes in the pathogenesis of schizophrenia. Mol. Psychiatry 17, 142-153 (2012).
125. Talkowski,M.E. et al. Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries. Cell 149, 525-537 (2012).
126. O’Roak,B.J. et al. Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat. Genet. 43, 585-589 (2011).
127. Santen,G.W. et al. Mutations in SWI/SNF chromatin remodeling complex gene ARID1B cause Coffin-Siris syndrome. Nat. Genet. 44, 379-380 (2012).
128. Conrad,D.F. et al. Variation in genome-wide mutation rates within and between human families. Nat. Genet. 43, 712-714 (2011).
129. Lynch,M. Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. U. S. A 107, 961-968 (2010).
130. Roach,J.C. et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636-639 (2010).


131. Iossifov,I. et al. De novo gene disruptions in children on the autistic spectrum. Neuron 74, 285-299 (2012).
132. Neale,B.M. et al. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242-245 (2012).
133. Sanders,S.J. et al. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237-241 (2012).
134. de Ligt,J. et al. Diagnostic Exome Sequencing in Persons with Intellectual Disability. New England Journal of Medicine [Epub ahead of print] (2012).
135. Caliskan,M. et al. Exome sequencing reveals a novel mutation for autosomal recessive non-syndromic mental retardation in the TECR gene on chromosome 19p13. Hum. Mol. Genet. 20, 1285-1289 (2011).
136. Abou,J.R. et al. Adaptor protein complex 4 deficiency causes severe autosomal-recessive intellectual disability, progressive spastic paraplegia, shy character, and short stature. Am. J. Hum. Genet. 88, 788-795 (2011).
137. Hussain,M.S. et al. A truncating mutation of CEP135 causes primary microcephaly and disturbed centrosomal function. Am. J. Hum. Genet. 90, 871-878 (2012).
138. Martinez,F.J. et al. Whole exome sequencing identifies a splicing mutation in NSUN2 as a cause of a Dubowitz-like syndrome. J. Med. Genet. 49, 380-385 (2012).
139. Puffenberger,E.G. et al. Genetic mapping and exome sequencing identify variants associated with five novel diseases. PLoS. One. 7, e28936 (2012).
140. Aldahmesh,M.A. et al. Recessive mutations in ELOVL4 cause ichthyosis, intellectual disability, and spastic quadriplegia. Am. J. Hum. Genet. 89, 745-750 (2011).
141. Krawitz,P.M. et al. Identity-by-descent filtering of exome sequence data identifies PIGV mutations in hyperphosphatasia mental retardation syndrome. Nat. Genet. 42, 827-829 (2010).
142. Krawitz,P.M. et al. Mutations in PIGO, a Member of the GPI-Anchor-Synthesis Pathway, Cause Hyperphosphatasia with Mental Retardation. Am. J. Hum. Genet. 91(1), 146-51 (2012).
143. Topper,S., Ober,C., & Das,S. Exome sequencing and the genetics of intellectual disability. Clin. Genet. 80, 117-126 (2011).
144. Bamshad,M.J. et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12, 745-755 (2011).
145. Veltman,J.A. & Brunner,H.G. De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565-575 (2012).
146. Ku,C.S. et al. A new paradigm emerges from the study of de novo mutations in the context of neurodevelopmental disease. Mol. Psychiatry [Epub ahead of print] (2012).
147. Bamshad,M.J. et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12, 745-755 (2011).
148. Teer,J.K. & Mullikin,J.C. Exome sequencing: the sweet spot before whole genomes. Hum. Mol. Genet. 19, R145-R151 (2010).
149. Nelen,M. & Veltman,J.A. Genome and exome sequencing in the clinic: unbiased genomic approaches with a high diagnostic yield. Pharmacogenomics. 13, 511-514 (2012).
150. Bartley,J.A. & Hall,B.D. Mental retardation and multiple congenital anomalies of unknown etiology: frequency of occurrence in similarly affected sibs of the proband. Birth Defects Orig. Artic. Ser. 14, 127-137 (1978).
151. Chelly,J., Khelfaoui,M., Francis,F., Cherif,B., & Bienvenu,T. Genetics and pathophysiology of mental retardation. European Journal of Human Genetics 14, 701-713 (2006).
152. Priest,J.H., Thuline,H.C., Laveck,G.D., & Jarvis,D.B. An Approach to Genetic-Factors in Mental-Retardation - Studies of Families Containing at Least 2 Siblings Admitted to A State Institution for the Retarded. American Journal of Mental Deficiency 66, 42-50 (1961).
153. Wright,S.W., Tarjan,G., & Eyer,L. Investigation of Families with 2 Or More Mentally Defective Siblings - Clinical Observations. Ama Journal of Diseases of Children 97, 445-463 (1959).
154. Abd El-Aziz,M.M. et al. EYS, encoding an ortholog of Drosophila spacemaker, is mutated in autosomal recessive retinitis pigmentosa. Nature Genetics 40, 1285-1287 (2008).
155. Collin,R.W. et al. Identification of a 2 Mb human ortholog of Drosophila eyes shut/spacemaker that is mutated in patients with retinitis pigmentosa. Am. J. Hum. Genet. 83, 594-603 (2008).


156. Hildebrandt,F. et al. A systematic approach to mapping recessive disease genes in individuals from outbred populations. PLoS. Genet. 5, e1000353 (2009).
157. Schuurs-Hoeijmakers,J.H. et al. Identification of recessive pathogenic alleles in small sibling families with intellectual disability. Submitted.
158. Schuurs-Hoeijmakers,J.H. et al. Mutations in DDHD2, encoding an intracellular phospholipase A1, cause a new recessive form of complex Hereditary Spastic Paraplegia. American Journal of Human Genetics. In press.
159. Schuurs-Hoeijmakers,J.H. et al. Recurrent de novo mutations in PACS1 cause defective cranial neural crest migration and define a new intellectual disability syndrome. American Journal of Human Genetics. In press.
160. Kalscheuer,V.M. et al. Draining the pond: 14 novel candidate genes for X-linked intellectual disability. Submitted.
161. Haas,J., Katus,H.A., & Meder,B. Next-generation sequencing entering the clinical arena. Mol. Cell Probes 25, 206-211 (2011).
162. Kleefstra,T. et al. Disruption of an EHMT1-Associated Chromatin-Modification Module Causes Intellectual Disability. Am. J. Hum. Genet. 91, 73-82 (2012).
163. Wakamatsu,N. et al. Mutations in SIP1, encoding Smad interacting protein-1, cause a form of Hirschsprung disease. Nat. Genet. 27, 369-370 (2001).
164. Wagenstaller,J. et al. Copy-number variations measured by single-nucleotide-polymorphism oligonucleotide arrays in patients with mental retardation. Am. J. Hum. Genet. 81, 768-779 (2007).
165. Williams,S.R. et al. Haploinsufficiency of HDAC4 causes brachydactyly mental retardation syndrome, with brachydactyly type E, developmental delays, and behavioral problems. Am. J. Hum. Genet. 87, 219-228 (2010).
166. Le,M.N. et al. MEF2C haploinsufficiency caused by either microdeletion of the 5q14.3 region or mutation is responsible for severe mental retardation with stereotypic movements, epilepsy and/or cerebral malformations. J. Med. Genet. 47, 22-29 (2010).
167. Imaizumi,K. et al. Sotos syndrome associated with a de novo balanced reciprocal translocation t(5;8)(q35;q24.1). Am. J. Med. Genet. 107, 58-60 (2002).
168. Liaw,D. et al. Germline mutations of the PTEN gene in Cowden disease, an inherited breast and thyroid cancer syndrome. Nat. Genet. 16, 64-67 (1997).
169. Kishino,T., Lalande,M., & Wagstaff,J. UBE3A/E6-AP mutations cause Angelman syndrome. Nat. Genet. 15, 70-73 (1997).
170. Petrij,F. et al. Rubinstein-Taybi syndrome caused by mutations in the transcriptional co-activator CBP. Nature 376, 348-351 (1995).
171. Slager,R.E., Newton,T.L., Vlangos,C.N., Finucane,B., & Elsea,S.H. Mutations in RAI1 associated with Smith-Magenis syndrome. Nat. Genet. 33, 466-468 (2003).
172. Willard,H.F. & Riordan,J.R. Assignment of the gene for myelin proteolipid protein to the X chromosome: implications for X-linked myelin disorders. Science 230, 940-942 (1985).
173. Sirmaci,A. et al. MASP1 mutations in patients with facial, umbilical, coccygeal, and auditory findings of Carnevale, Malpuech, OSA, and Michels syndromes. Am. J. Hum. Genet. 87, 679-686 (2010).
174. Bilguvar,K. et al. Whole-exome sequencing identifies recessive WDR62 mutations in severe brain malformations. Nature 467, 207-210 (2010).
175. Doi,H. et al. Exome sequencing reveals a homozygous SYT14 mutation in adult-onset, autosomal-recessive spinocerebellar ataxia with psychomotor retardation. Am. J. Hum. Genet. 89, 320-327 (2011).
176. de Greef,J.C. et al. Mutations in ZBTB24 are associated with immunodeficiency, centromeric instability, and facial anomalies syndrome type 2. Am. J. Hum. Genet. 88, 796-804 (2011).
177. Barak,T. et al. Recessive LAMC3 mutations cause malformations of occipital cortical development. Nat. Genet. 43, 590-594 (2011).
178. Bjursell,M.K. et al. Adenosine kinase deficiency disrupts the methionine cycle and causes hypermethioninemia, encephalopathy, and abnormal liver function. Am. J. Hum. Genet. 89, 507-515 (2011).
179. Clayton-Smith,J. et al. Whole-exome-sequencing identifies mutations in histone acetyltransferase gene KAT6B in individuals with the Say-Barber-Biesecker variant of Ohdo syndrome. Am. J. Hum. Genet. 89, 675-681 (2011).


180. Saitsu,H. et al. Mutations in POLR3A and POLR3B encoding RNA Polymerase III subunits cause an autosomal-recessive hypomyelinating leukoencephalopathy. Am. J. Hum. Genet. 89, 644-651 (2011).
181. Kalay,E. et al. CEP152 is a genome maintenance protein disrupted in Seckel syndrome. Nat. Genet. 43, 23-26 (2011).
182. Sirmaci,A. et al. Mutations in ANKRD11 cause KBG syndrome, characterized by intellectual disability, skeletal malformations, and macrodontia. Am. J. Hum. Genet. 89, 289-294 (2011).
183. Sloan,J.L. et al. Exome sequencing identifies ACSF3 as a cause of combined malonic and methylmalonic aciduria. Nat. Genet. 43, 883-886 (2011).
184. Le,G.C. et al. Mutations at a single codon in Mad homology 2 domain of SMAD4 cause Myhre syndrome. Nat. Genet. 44, 85-88 (2012).
185. Shaheen,R. et al. Recessive mutations in DOCK6, encoding the guanidine nucleotide exchange factor DOCK6, lead to abnormal actin cytoskeleton organization and Adams-Oliver syndrome. Am. J. Hum. Genet. 89, 328-333 (2011).
186. Hanson,D. et al. Exome sequencing identifies CCDC8 mutations in 3-M syndrome, suggesting that CCDC8 contributes in a pathway with CUL7 and OBSL1 to control human growth. Am. J. Hum. Genet. 89, 148-153 (2011).
187. Jones,M.A. et al. DDOST mutations identified by whole-exome sequencing are implicated in congenital disorders of glycosylation. Am. J. Hum. Genet. 90, 363-368 (2012).
188. Hussain,M.S. et al. A truncating mutation of CEP135 causes primary microcephaly and disturbed centrosomal function. Am. J. Hum. Genet. 90, 871-878 (2012).
189. Gibson,W.T. et al. Mutations in EZH2 cause Weaver syndrome. Am. J. Hum. Genet. 90, 110-118 (2012).
190. Campeau,P.M. et al. Mutations in KAT6B, encoding a histone acetyltransferase, cause Genitopatellar syndrome. Am. J. Hum. Genet. 90, 282-289 (2012).
191. Simpson,M.A. et al. De novo mutations of the gene encoding the histone acetyltransferase KAT6B cause Genitopatellar syndrome. Am. J. Hum. Genet. 90, 290-294 (2012).
192. Ostergaard,P. et al. Mutations in KIF11 cause autosomal-dominant microcephaly variably associated with congenital lymphedema and chorioretinopathy. Am. J. Hum. Genet. 90, 356-362 (2012).
193. Jones,W.D. et al. De Novo Mutations in MLL Cause Wiedemann-Steiner Syndrome. Am. J. Hum. Genet. 91, 358-364 (2012).
194. Michot,C. et al. Exome sequencing identifies PDE4D mutations as another cause of acrodysostosis. Am. J. Hum. Genet. 90, 740-745 (2012).
195. Lee,H. et al. Exome sequencing identifies PDE4D mutations in acrodysostosis. Am. J. Hum. Genet. 90, 746-751 (2012).
196. Hood,R.L. et al. Mutations in SRCAP, encoding SNF2-related CREBBP activator protein, cause Floating-Harbor syndrome. Am. J. Hum. Genet. 90, 308-313 (2012).
197. Bochukova,E. et al. A mutation in the thyroid hormone receptor alpha gene. N. Engl. J. Med. 366, 243-249 (2012).
198. Ng,B.G. et al. Mutations in the glycosylphosphatidylinositol gene PIGL cause CHIME syndrome. Am. J. Hum. Genet. 90, 685-688 (2012).
199. Lines,M.A. et al. Haploinsufficiency of a spliceosomal GTPase encoded by EFTUD2 causes mandibulofacial dysostosis with microcephaly. Am. J. Hum. Genet. 90, 369-377 (2012).
200. Van Houdt,J.K. et al. Heterozygous missense mutations in SMARCA2 cause Nicolaides-Baraitser syndrome. Nat. Genet. 44, 445-9, S1 (2012).
201. Stevenson,R.E., Procopio-Allen,A.M., Schroer,R.J., & Collins,J.S. Genetic syndromes among individuals with mental retardation. Am. J. Med. Genet. A 123A, 29-32 (2003).
202. Mandel,J.L. & Chelly,J. Monogenic X-linked mental retardation: Is it as frequent as currently estimated? The paradox of the ARX (Aristaless X) mutations. European Journal of Human Genetics 12, 689-693 (2004).
203. Miller,S.A., Dykes,D.D., & Polesky,H.F. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 16, 1215 (1988).
204. Nannya,Y. et al. A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Research 65, 6071-6079 (2005).


205. Purcell,S. et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559-575 (2007).
206. Curtis,D., Vine,A.E., & Knight,J. Study of regions of extended homozygosity provides a powerful method to explore haplotype structure of human populations. Annals of Human Genetics 72, 261-278 (2008).
207. Gibson,J., Morton,N.E., & Collins,A. Extended tracts of homozygosity in outbred human populations. Human Molecular Genetics 15, 789-795 (2006).
208. Lencz,T. et al. Runs of homozygosity reveal highly penetrant recessive loci in schizophrenia. PNAS 104, 19942-19947 (2007).
209. Nalls,M.A. et al. Measures of Autozygosity in Decline: Globalization, Urbanization, and Its Implications for Medical Genetics. Plos Genetics 5, (2009).
210. Najmabadi,H. et al. Homozygosity mapping in consanguineous families reveals extreme heterogeneity of non-syndromic autosomal recessive mental retardation and identifies 8 novel gene loci. Hum. Genet. 121, 43-48 (2007).
211. Collins,J.S., Marvelle,A.F., & Stevenson,R.E. Sibling recurrence in intellectual disability of unknown cause. Clin. Genet. 79, 498-500 (2011).
212. Turner,G. & Partington,M. Recurrence risks in undiagnosed mental retardation. J. Med. Genet. 37, E45 (2000).
213. Robinson,J.T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24-26 (2011).
214. Rozen,S. & Skaletsky,H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132, 365-386 (2000).
215. Klambauer,G. et al. cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res. (2012).
216. Livak,K.J. & Schmittgen,T.D. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25, 402-408 (2001).
217. Pfaffl,M.W. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res. 29, e45 (2001).
218. Calabrese,R., Capriotti,E., Fariselli,P., Martelli,P.L., & Casadio,R. Functional annotations improve the predictive score of human disease-related mutations in proteins. Hum. Mutat. 30, 1237-1244 (2009).
219. Adzhubei,I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248-249 (2010).
220. Gilfillan,G.D. et al. SLC9A6 mutations cause X-linked mental retardation, microcephaly, epilepsy, and ataxia, a phenotype mimicking Angelman syndrome. Am. J. Hum. Genet. 82, 1003-1010 (2008).
221. Salomons,G.S. et al. X-linked creatine-transporter gene (SLC6A8) defect: a new creatine-deficiency syndrome. Am. J. Hum. Genet. 68, 1497-1500 (2001).
222. Clark,A.J. et al. X-linked creatine transporter (SLC6A8) mutations in about 1% of males with mental retardation of unknown etiology. Hum. Genet. 119, 604-610 (2006).
223. Rosenberg,E.H. et al. Functional characterization of missense variants in the creatine transporter gene (SLC6A8): improved diagnostic application. Hum. Mutat. 28, 890-896 (2007).
224. Inoue,H. et al. Roles of SAM and DDHD domains in mammalian intracellular phospholipase A1 KIAA0725p. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1823, 930-939 (2012).
225. Sato,S., Inoue,H., Kogure,T., Tagaya,M., & Tani,K. Golgi-localized KIAA0725p regulates membrane trafficking from the Golgi apparatus to the plasma membrane in mammalian cells. FEBS Lett. 584, 4389-4395 (2010).
226. Blackstone,C., O'Kane,C.J., & Reid,E. Hereditary spastic paraplegias: membrane traffic and the motor pathway. Nat. Rev. Neurosci. 12, 31-42 (2011).
227. Burgos,P.V. et al. Sorting of the Alzheimer’s disease amyloid precursor protein mediated by the AP-4 complex. Dev. Cell 18, 425-436 (2010).
228. Murmu,R.P. et al. Cellular distribution and subcellular localization of spatacsin and spastizin, two proteins involved in hereditary spastic paraplegia. Mol. Cell Neurosci. 47, 191-202 (2011).
229. Ng,D. et al. Oculofaciocardiodental and Lenz microphthalmia syndromes result from distinct classes of mutations in BCOR. Nat. Genet. 36, 411-416 (2004).
230. Stromme,P. et al. Mutations in the human ortholog of Aristaless cause X-linked mental retardation and epilepsy. Nat. Genet. 30, 441-445 (2002).


231. Amir,R.E. et al. Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat. Genet. 23, 185-188 (1999).
232. Ariani,F. et al. FOXG1 is responsible for the congenital variant of Rett syndrome. Am. J. Hum. Genet. 83, 89-93 (2008).
233. Iafrate,A.J. et al. Detection of large-scale variation in the human genome. Nat. Genet. 36, 949-951 (2004).
234. Shmueli,O. et al. GeneNote: whole genome expression profiles in normal human tissues. C. R. Biol. 326, 1067-1072 (2003).
235. Zhang,X. et al. Syne-1 and Syne-2 play crucial roles in myonuclear anchorage and motor neuron innervation. Development 134, 901-908 (2007).
236. Attali,R. et al. Mutation of SYNE-1, encoding an essential component of the nuclear lamina, is responsible for autosomal recessive arthrogryposis. Hum. Mol. Genet. 18, 3462-3469 (2009).
237. Thusberg,J., Olatubosun,A., & Vihinen,M. Performance of mutation pathogenicity prediction methods on missense variants. Hum. Mutat. 32, 358-368 (2011).
238. Cortes,L.F., El Khamisy,S.F., Zuma,M.C., Osborn,K., & Caldecott,K.W. A human 5'-tyrosyl DNA phosphodiesterase that repairs topoisomerase-mediated DNA damage. Nature 461, 674-678 (2009).
239. Li,C., Sun,S.Y., Khuri,F.R., & Li,R. Pleiotropic functions of EAPII/TTRAP/TDP2: cancer development, chemoresistance and beyond. Cell Cycle 10, 3274-3283 (2011).
240. Shoichet,S.A. et al. Mutations in the ZNF41 gene are associated with cognitive deficits: identification of a new candidate for X-linked mental retardation. Am. J. Hum. Genet. 73, 1341-1354 (2003).
241. Kleefstra,T. et al. Zinc finger 81 (ZNF81) mutations associated with X-linked mental retardation. J. Med. Genet. 41, 394-399 (2004).
242. Lugtenberg,D. et al. ZNF674: A new Kruppel-associated box-containing zinc-finger gene involved in nonsyndromic X-linked mental retardation. American Journal of Human Genetics 78, 265-278 (2006).
243. de,L.N. et al. UBE2A deficiency syndrome: Mild to severe intellectual disability accompanied by seizures, absent speech, urogenital, and skin anomalies in male patients. Am. J. Med. Genet. A 152A, 3084-3090 (2010).
244. Kenwrick,S. et al. Linkage studies of X-linked recessive spastic paraplegia using DNA probes. Hum. Genet. 73, 264-266 (1986).
245. Schule,R. & Schols,L. Genetics of hereditary spastic paraplegias. Semin. Neurol. 31, 484-493 (2011).
246. Simpson,M.A. et al. Maspardin is mutated in mast syndrome, a complicated form of hereditary spastic paraplegia associated with dementia. Am. J. Hum. Genet. 73, 1147-1156 (2003).
247. Orthmann-Murphy,J.L. et al. Hereditary spastic paraplegia is a novel phenotype for GJA12/GJC2 mutations. Brain 132, 426-438 (2009).
248. Bauer,P. et al. Mutation in the AP4B1 gene cause hereditary spastic paraplegia type 47 (SPG47). Neurogenetics. 13, 73-76 (2012).
249. Li,H. & Durbin,R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25, 1754-1760 (2009).
250. McKenna,A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297-1303 (2010).
251. Li,H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25, 2078-2079 (2009).
252. Wang,K., Li,M., & Hakonarson,H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
253. Al-Yahyaee,S. et al. A novel locus for hereditary spastic paraplegia with thin corpus callosum and epilepsy. Neurology 66, 1230-1234 (2006).
254. Alazami,A.M., Adly,N., Al,D.H., & Alkuraya,F.S. A nullimorphic ERLIN2 mutation defines a complicated hereditary spastic paraplegia locus (SPG18). Neurogenetics. 12, 333-336 (2011).
255. Yildirim,Y. et al. A frameshift mutation of ERLIN2 in recessive intellectual disability, motor dysfunction and multiple joint contractures. Hum. Mol. Genet. 20, 1886-1892 (2011).
256. Willemsen,M.A. et al. Clinical, biochemical and molecular genetic characteristics of 19 patients with the Sjogren-Larsson syndrome. Brain 124, 1426-1437 (2001).
257. Coene,K.L. et al. OFD1 is mutated in X-linked Joubert syndrome and interacts with LCA5-encoded lebercilin. Am. J. Hum. Genet. 85, 465-481 (2009).

258. Wortmann,S.B. et al. Mutations in the phospholipid remodeling gene SERAC1 impair mitochondrial function and intracellular cholesterol trafficking and cause dystonia and deafness. Nat. Genet. 44, 797-802 (2012).
259. Nakajima,K. et al. A novel phospholipase A1 with sequence homology to a mammalian Sec23p-interacting protein, p125. J. Biol. Chem. 277, 11329-11335 (2002).
260. Tani,K., Mizoguchi,T., Iwamatsu,A., Hatsuzawa,K., & Tagaya,M. p125 is a novel mammalian Sec23p-interacting protein with structural similarity to phospholipid-modifying proteins. J. Biol. Chem. 274, 20505-20512 (1999).
261. Morikawa,R.K. et al. Intracellular phospholipase A1gamma (iPLA1gamma) is a novel factor involved in coat protein complex I- and Rab6-independent retrograde transport between the endoplasmic reticulum and the Golgi complex. J. Biol. Chem. 284, 26620-26630 (2009).
262. Brand,A.H. & Perrimon,N. Targeted gene expression as a means of altering cell fates and generating dominant phenotypes. Development 118, 401-415 (1993).
263. Dietzl,G. et al. A genome-wide transgenic RNAi library for conditional gene inactivation in Drosophila. Nature 448, 151-156 (2007).
264. Koh,Y.H., Gramates,L.S., & Budnik,V. Drosophila larval neuromuscular junction: molecular components and mechanisms underlying synaptic plasticity. Microsc. Res. Tech. 49, 14-25 (2000).
265. Bayat,V., Jaiswal,M., & Bellen,H.J. The BMP signaling pathway at the Drosophila neuromuscular junction and its links to neurodegenerative diseases. Curr. Opin. Neurobiol. 21, 182-188 (2011).
266. Liu,Z., Huang,Y., Zhang,Y., Chen,D., & Zhang,Y.Q. Drosophila Acyl-CoA synthetase long-chain family member 4 regulates axonal transport of synaptic vesicles and is required for synaptic development and transmission. J. Neurosci. 31, 2052-2063 (2011).
267. Schenck,A. et al. CYFIP/Sra-1 controls neuronal connectivity in Drosophila and links the Rac1 GTPase pathway to the fragile X protein. Neuron 38, 887-898 (2003).
268. Zweier,C. et al. CNTNAP2 and NRXN1 are mutated in autosomal-recessive Pitt-Hopkins-like mental retardation and determine the level of a common synaptic protein in Drosophila. Am. J. Hum. Genet. 85, 655-666 (2009).
269. Wichmann,C. & Sigrist,S.J. The active zone T-bar - a plasticity module? J. Neurogenet. 24, 133-145 (2010).
270. Dreha-Kulaczewski,S. et al. Cerebral metabolic and structural alterations in hereditary spastic paraplegia with thin corpus callosum assessed by MRS and DTI. Neuroradiology 48, 893-898 (2006).
271. Hobson,G.M. & Garbern,J.Y. Pelizaeus-Merzbacher disease, Pelizaeus-Merzbacher-like disease 1, and related hypomyelinating disorders. Semin. Neurol. 32, 62-67 (2012).
272. Stromillo,M.L. et al. Structural and metabolic damage in brains of patients with SPG11-related spastic paraplegia as detected by quantitative MRI. J. Neurol. 258, 2240-2247 (2011).
273. Sjostrand,F.S. A method to improve contrast in high resolution electron microscopy of ultrathin tissue sections. Exp. Cell Res. 10, 657-664 (1956).
274. Reynold,E.S. The use of lead nitrate at high pH as an electron opaque stain in electron microscopy. J. Cell Biol. 17, 208 (1965).
275. Karnovsky,M.J. A formaldehyde-glutaraldehyde fixative of high osmolarity for use in electron microscopy. J. Cell Biol. 27, 137A (1965).
276. Simmen,T. et al. PACS-2 controls endoplasmic reticulum-mitochondria communication and Bid-mediated apoptosis. EMBO J. 24, 717-729 (2005).
277. Youker,R.T., Shinde,U., Day,R., & Thomas,G. At the crossroads of homoeostasis and disease: roles of the PACS proteins in membrane traffic and apoptosis. Biochem. J. 421, 1-15 (2009).
278. Schermer,B. et al. Phosphorylation by casein kinase 2 induces PACS-1 binding of nephrocystin and targeting to cilia. EMBO J. 24, 4415-4424 (2005).
279. Wan,L. et al. PACS-1 defines a novel gene family of cytosolic sorting proteins required for trans-Golgi network localization. Cell 94, 205-216 (1998).
280. Scott,G.K. et al. The phosphorylation state of an autoregulatory domain controls PACS-1-directed protein traffic. EMBO J. 22, 6234-6244 (2003).
281. Scott,G.K., Fei,H., Thomas,L., Medigeshi,G.R., & Thomas,G. A PACS-1, GGA3 and CK2 complex regulates CI-MPR trafficking. EMBO J. 25, 4423-4435 (2006).

282. Cordero,D.R. et al. Cranial neural crest cells on the move: their roles in craniofacial development. Am. J. Med. Genet. A 155A, 270-279 (2011).
283. Sauka-Spengler,T. & Bronner-Fraser,M. A gene regulatory network orchestrates neural crest formation. Nat. Rev. Mol. Cell Biol. 9, 557-568 (2008).
284. Vaglia,J.L. & Hall,B.K. Regulation of neural crest cell populations: occurrence, distribution and underlying mechanisms. Int. J. Dev. Biol. 43, 95-110 (1999).
285. Fiorio,P.A. et al. TRPV4 mediates tumor-derived endothelial cell migration via arachidonic acid-activated actin remodeling. Oncogene 31, 200-212 (2012).
286. Holzer,P. Transient receptor potential (TRP) channels as drug targets for diseases of the digestive system. Pharmacol. Ther. 131, 142-170 (2011).
287. Shin,S.H. et al. Phosphorylation on the Ser 824 residue of TRPV4 prefers to bind with F-actin than with microtubules to expand the cell surface area. Cell Signal. 24, 641-651 (2012).
288. Ahmed,M.K., Takumida,M., Ishibashi,T., Hamamoto,T., & Hirakawa,K. Expression of transient receptor potential vanilloid (TRPV) families 1, 2, 3 and 4 in the mouse olfactory epithelium. Rhinology 47, 242-247 (2009).
289. Jenkins,P.M., Zhang,L., Thomas,G., & Martens,J.R. PACS-1 mediates phosphorylation-dependent ciliary trafficking of the cyclic-nucleotide-gated channel in olfactory sensory neurons. J. Neurosci. 29, 10541-10551 (2009).
290. Liedtke,W., Tobin,D.M., Bargmann,C.I., & Friedman,J.M. Mammalian TRPV4 (VR-OAC) directs behavioral responses to osmotic and mechanical stimuli in Caenorhabditis elegans. Proc. Natl. Acad. Sci. U. S. A 100 Suppl 2, 14531-14536 (2003).
291. van Bokhoven,H. Genetic and epigenetic networks in intellectual disabilities. Annu. Rev. Genet. 45, 81-104 (2011).
292. Schuurs-Hoeijmakers,J.H. et al. Homozygosity mapping in outbred families with mental retardation. Eur. J. Hum. Genet. 19, 597-601 (2011).
293. Roach,J.C. et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636-639 (2010).
294. Kondrashov,A.S. Direct estimates of human per nucleotide mutation rates at 20 loci causing Mendelian diseases. Hum. Mutat. 21, 12-27 (2003).
295. Vogel,F. & Rathenberg,R. Spontaneous mutation in man. Adv. Hum. Genet. 5, 223-318 (1975).
296. Willemsen,M.H. et al. Mutations in DYNC1H1 cause severe intellectual disability with neuronal migration defects. J. Med. Genet. 49, 179-183 (2012).
297. Feenstra,I. et al. European Cytogeneticists Association Register of Unbalanced Chromosome Aberrations (ECARUCA); an online database for rare chromosome abnormalities. Eur. J. Med. Genet. 49, 279-291 (2006).
298. Firth,H.V. et al. DECIPHER: Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources. Am. J. Hum. Genet. 84, 524-533 (2009).
299. Mariani,J. et al. Modeling human cortical development in vitro using induced pluripotent stem cells. Proc. Natl. Acad. Sci. U. S. A 109, 12770-12775 (2012).
300. Kuo,Y.C. & Wang,C.T. Neuronal differentiation of induced pluripotent stem cells in hybrid polyester scaffolds with heparinized surface. Colloids Surf. B Biointerfaces. 100, 9-15 (2012).
301. Thoma,E.C. et al. Ectopic Expression of Neurogenin 2 Alone is Sufficient to Induce Differentiation of Embryonic Stem Cells into Mature Neurons. PLoS. One. 7, e38651 (2012).
302. Chambers,S.M. et al. Combined small-molecule inhibition accelerates developmental timing and converts human pluripotent stem cells into nociceptors. Nat. Biotechnol. 30, 715-720 (2012).
303. Naegele,J.R., Vemuri,M.C., & Studer,L. Embryonic Stem Cell Therapy for Intractable Epilepsy. Mechanisms of the Epilepsies 4th Edition (2012).
304. Bardy,J. et al. Microcarrier suspension cultures for high density expansion and differentiation of human pluripotent stem cells to neural progenitor cells. Tissue Eng Part C. Methods [Epub ahead of print] (2012).
305. Cooper,G.M. et al. A copy number variation morbidity map of developmental delay. Nat. Genet. 43, 838-846 (2011).

306. Botstein,D. & Risch,N. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat. Genet. 33 Suppl, 228-237 (2003).
307. Johnston,J.J. et al. Secondary variants in individuals undergoing exome sequencing: screening of 572 individuals identifies high-penetrance mutations in cancer-susceptibility genes. Am. J. Hum. Genet. 91, 97-108 (2012).
308. Kitzman,J.O. et al. Noninvasive whole-genome sequencing of a human fetus. Sci. Transl. Med. 4, 137ra76 (2012).
309. Wetmore,D.Z. & Garner,C.C. Emerging pharmacotherapies for neurodevelopmental disorders. J. Dev. Behav. Pediatr. 31, 564-581 (2010).
310. Kramer,J.M. et al. Epigenetic regulation of learning and memory by Drosophila EHMT/G9a. PLoS. Biol. 9, e1000569 (2011).

DANKWOORD | ACKNOWLEDGEMENTS

To the researchers of the Friday meeting: thank you for your input. Bregje, thank you for all your help and tips at the start of my project. Together with Bert, you set up the sibling project. Thank you also for always letting me come by to blow off steam. I wish you a good pregnancy and, in 2014, a wonderful year in Australia. David, always calm and level-headed during the work meetings and willing to think along. Daniëlle, I am very curious about the first exome results of your CVI cohort. And Anneke, my roommate and sounding board since the birth of Thijs. Thank you for all your help in setting up a database and in collecting DNA and informed consent for the sibling families. It was wonderful to have a roommate who was willing to think along while I was writing the chapters of my thesis. And how nice that you are my paranymph.

Willy and Erik-Jan, thank you for all the sequencing work you have done.

Lisenka, Konny, Alex, Joris and the Genomic Disorders group: thank you for setting up and carrying out all the exome experiments, and for your enthusiasm, the pleasant collaboration and your help with the data analysis.

Christian and Jane, thank you for your bioinformatics support with all sorts of things.

Rolph, Nicole and the array group: I would like to thank you for carrying out all the array experiments and for discussing interesting findings during the Thursday meeting.

Saskia and the ladies of the cell culture facility, thank you for taking such good care of culturing the patient cells.

I would like to thank the clinical genetics section for their willingness to take part in research projects, in particular Tjitske with her knowledge of ID. Ineke, thank you for your flexibility and for the opportunity you gave me to make an 'alternative' start to the training programme. You now hold the result in your hands! I look forward to starting in the clinic from February 2013 and really becoming part of the team of residents.

Annette and Tjitske, thank you for letting me use your carefully compiled ID gene list to make the diagrams in chapter 1 of this thesis.

Bonnie and the fly group: thank you for the lightning-fast fly work, a valuable addition to the clinical part of chapter 4!

During my PhD I had the opportunity to supervise several students and show them the ropes in the lab. In particular I would like to thank Ilse and Gausiya for their great commitment and their contribution to chapters 3 and 4 of this thesis. It was a great pleasure to work with you. I still owe you a dinner.

Thanks to all collaborators from outside the Human Genetics Department, especially the European microdeletion network: my gratitude goes to Dr. Romano, Prof. Kooy, Prof. Zuffardi, Dr. Knight and Prof. Nordenskjöld for the ongoing collaboration and the useful yearly meetings. Thanks to Prof. Caldecott and his group for the ongoing collaboration that was initiated by findings discussed in Chapter 3 and that is not yet part of this thesis. Special thanks also to I. Otte-Höller and Prof. M. Lammens from the Department of Pathology in Nijmegen, and to M.T. Geraghty, S. Ben-Salem and Prof. L. Al-Gazali, our collaborators from Canada and Oman, for their work and efforts on families with mutations in DDHD2. Lastly, special thanks to Prof. Katsanis and Dr. Oh for their efforts in studying PACS1 mutant zebrafish embryos (Chapter 5).

I would also like to mention a number of friends in particular: Ymeen & Roelant, Corine & René, Marije & Sander and Jasper & Kim. Over the past nine months we have not seen each other as often as Klaas Jan and I would have liked. Thank you for your understanding, sweet cards and encouraging words during the final stretch of writing.

Dear grandpa and grandma van Dommelen and grandma Hoeijmakers, how lucky I feel that you can be present at my PhD defence!

Dear Mathijs & Jane, Wieteke & Jelle, Marieke & Jeroen and Lineke & Harald, how sweet that you are so involved with me, Klaas Jan and the two little boys. Wieteke, Jelle and Marieke, thank you for all your help over the past year and a half! Wieteke, I think it is great to have one of my sisters by my side as paranymph. We certainly do not talk about work all the time when we see each other, but I really appreciated being able to talk with you about doing a PhD and about the research culture with all its ups and downs.

Dear Carla, how nice that once every fortnight you take care of Thijs, and now Thomas too, so that Klaas Jan and I can go to work with peace of mind. It is also lovely to see what a good time you and Thijs have on those Mondays.

Dear mum and dear dad, thank you so much for all your care and for the opportunities you have given me from an early age. You gave me the freedom to choose my own path, with your unconditional support for my choices. Dear mum, over the past year you so often stepped in on my days off, did all sorts of things around the house and went out with Thijs, so that I had my hands free to dive behind the computer once more. And you too, dear dad, often helped Klaas Jan with the work on the attic this past year to take work off my hands. It is wonderful to see how much you enjoy Thijs and Thomas, and vice versa. How good that you are there for us. We love you. Thank you for all your help!

And then, 'save the best for last': my dearest Klaas Jan, dear Thijs and dear little Thomas, because of you I realise every day that there are more important things in life than work. I enjoy the moments when we are at home together or out and about with all of us. You give me so much energy and make me happy. Klaas Jan, you motivated and supported me time and again to bring this thesis to a good end, even though this meant that I had less time for you and Thijs. Thank you for your patience while I completed this thesis.

I have spoken.

CURRICULUM VITAE

Curriculum Vitae

Johanna (Janneke) Hendrica Maria Schuurs-Hoeijmakers was born on 26 March 1982 in Zevenhuizen (ZH) as the second child in a family of five children. She attended secondary school at the Coenecoop College in Waddinxveen, where she obtained her VWO diploma (cum laude) in 2000. She then studied medicine at Leiden University. Alongside her studies she developed a passion for rowing. This resulted in two years as a rower in the women's competition section of A.L.S.R.V. Asopos de Vliet, followed by three years of intensive coaching of the next generations of competitive women rowers. During the elective clerkship of her medical studies she did an internship at the Department of Clinical Genetics of the Leiden University Medical Centre, followed by a research internship in the Cytogenetics section of the same hospital on the subject 'molecular karyotyping using array CGH in individuals with intellectual disability and/or multiple congenital anomalies'. In February 2008 she passed her medical examination. From January 2008 to June 2008 she worked as a resident not in training (ANIOS) in the Clinical Genetics section of the Leiden University Medical Centre. In June 2008 she started her PhD in the Clinical Genetics section of the UMC St. Radboud on the research project 'Genes and Mental Retardation', funded by ZonMW (ViDi grant to Dr. B.B.A. de Vries), with Prof. Dr. H.G. Brunner and Prof. Dr. H. van Bokhoven as promotors and Dr. B.B.A. de Vries and Dr. A.P.M. de Brouwer as co-promotors. Janneke studied the genetic causes of intellectual disability with the aim of identifying mutations that lead to intellectual disability in individuals in whom routine diagnostic testing had not yielded a molecular diagnosis. Special attention within this project was given to non-consanguineous families with brothers and/or sisters who both have an intellectual disability. Research findings were presented at several international conferences, including the 12th International Congress of Human Genetics in Montréal, Canada, in 2011. There, Janneke received a Young Investigator Travel Award for her presentation 'Massive parallel sequencing of all human protein-coding genes identifies PACS1 as a new gene for a new intellectual disability syndrome'. Since May 2012 she has been training as a clinical geneticist at the UMC St. Radboud in Nijmegen. She is married to Klaas Jan Schuurs and has two sons: Thijs and Thomas.

Curriculum Vitae

Johanna (Janneke) Hendrica Maria Schuurs-Hoeijmakers was born on March 26, 1982 in Zevenhuizen, the Netherlands, as the second child in a family of five children. In 2000 she completed her secondary education cum laude at the Coenecoop College in Waddinxveen. In the same year she started studying Medicine at Leiden University. Alongside her studies she developed a passion for rowing and was part of the women's racing team of A.L.S.R.V. Asopos de Vliet for two years, after which she exchanged her position in the boat for a position as coach of the women's racing teams for several years. During her studies she did an internship at the Clinical Genetics Department of the Leiden University Medical Centre, followed by a scientific internship in the Cytogenetics Department of the same hospital. The subject of this internship was 'molecular karyotyping by array CGH in individuals with intellectual disability and/or congenital anomalies'. She obtained her medical degree in February 2008. From January until June of the same year, she worked as a medical trainee at the Clinical Genetics Department of the Leiden University Medical Centre. From June 2008 until May 2012 she worked as a PhD student at the Department of Human Genetics of the Radboud University Nijmegen Medical Centre (promotors: Prof. H.G. Brunner and Prof. H. van Bokhoven; co-promotors: Dr. B.B.A. de Vries and Dr. A.P.M. de Brouwer). Her PhD project was part of the research project 'Genes and Mental Retardation', which was funded by ZonMW (ViDi grant to Dr. B.B.A. de Vries, MD). The aim of her research was to identify mutations in individuals with intellectual disability in whom the routine diagnostic work-up did not provide a molecular basis for the disorder. There was a special focus on non-consanguineous families with multiple affected siblings. Results were presented at several international meetings, among them the 12th International Congress of Human Genetics in Montréal, Canada, in 2011. There, Janneke received a Young Investigator Travel Award for her presentation 'Massive parallel sequencing of all human protein-coding genes identifies PACS1 as a new gene for a new intellectual disability syndrome'. In May 2012 Janneke started her training in Clinical Genetics in Nijmegen. She is married to Klaas Jan Schuurs and has two sons: Thijs and Thomas.

LIST OF PUBLICATIONS

List of publications

Identification of recessive pathogenic alleles in small sibling families with intellectual disability. Schuurs-Hoeijmakers JH, Vulto-van Silfhout AT, Vissers LE, van de Vondervoort II, van Bon BW, de Ligt J, Gilissen C, Hehir-Kwa JY, Neveling K, del Rosario M, Hira G, Reitano S, Vitello A, Failla P, Greco D, Fichera M, Galesi O, Kleefstra T, Greally MT, Ockeloen CW, Willemsen MH, Bongers EM, Janssen IM, Pfundt R, Veltman JA, Romano C, Willemsen MA, van Bokhoven H, Brunner HG, de Vries BB, de Brouwer AP. Submitted.

Mutations in DDHD2, encoding an intracellular phospholipase A1, cause a recessive form of complex Hereditary Spastic Paraplegia. Schuurs-Hoeijmakers JH*, Geraghty MT*, Kamsteeg EJ*, Ben-Salem S, de Bot ST, Nijhof B, van de Vondervoort II, van der Graaf M, Castells Nobau A, Otte-Höller I, Vermeer S, Smith AC, Humphreys P, Schwartzentruber J, FORGE Canada Consortium, Ali BR, Al-Yahyaee SA, Tariq S, Pramathan T, Bayoumi R, Kremer HP, van de Warrenburg BP, van den Akker WM, Gilissen C, Veltman JA, Janssen IM, Vulto-van Silfhout AT, van der Velde-Visser S, Lefeber DJ, Diekstra A, Erasmus CE, Willemsen MA, Vissers LE, Lammens M, van Bokhoven H, Brunner HG, Wevers RA, Schenck A, Al-Gazali L#, de Vries BB#, de Brouwer AP. Am. J. Hum. Genet. In press. (* equal first author contribution, # equal last author contribution).

Recurrent de novo mutations in PACS1 cause defective cranial neural crest migration and define a recognizable intellectual disability syndrome. Schuurs-Hoeijmakers JH*, Oh EC*, Vissers LE*, Swinkels ME, Gilissen C, Willemsen MA, Holvoet M, Steehouwer M, Veltman JA, de Vries BB, van Bokhoven H, de Brouwer AP, Katsanis N#, Devriendt K#, Brunner HG. Am. J. Hum. Genet. In press. (* equal first author contribution, # equal last author contribution).

Mutations in the phospholipid remodeling gene SERAC1 impair mitochondrial function and intracellular cholesterol trafficking and cause dystonia and deafness. Wortmann SB, Vaz FM, Gardeitchik T, Vissers LE, Renkema GH, Schuurs-Hoeijmakers JH, Kulik W, Lammens M, Christin C, Kluijtmans LA, Rodenburg RJ, Nijtmans LG, Grünewald A, Klein C, Gerhold JM, Kozicz T, van Hasselt PM, Harakalova M, Kloosterman W, Bari I, Pronicka E, Ucar SK, Naess K, Singhal KK, Krumina Z, Gilissen C, van Bokhoven H, Veltman JA, Smeitink JA, Lefeber DJ, Spelbrink JN, Wevers RA, Morava E, de Brouwer AP. Nat. Genet. 44(7), 797-802 (2012). PMID: 22683713.

Autosomal recessive dilated cardiomyopathy due to DOLK mutations results from abnormal dystroglycan O-mannosylation. Lefeber DJ, de Brouwer AP, Morava E, Riemersma M, Schuurs-Hoeijmakers JH, Absmanner B, Verrijp K, van den Akker WM, Huijben K, Steenbergen G, van Reeuwijk J, Jozwiak A, Zucker N, Lorber A, Lammens M, Knopf C, van Bokhoven H, Grünewald S, Lehle L, Kapusta L, Mandel H, Wevers RA. PLoS Genet. 7(12), e1002427 (2011). PMID: 22242004.

C14ORF179 encoding IFT43 is mutated in Sensenbrenner syndrome. Arts HH, Bongers EM, Mans DA, van Beersum SE, Oud MM, Bolat E, Spruijt L, Cornelissen EA, Schuurs-Hoeijmakers JH, de Leeuw N, Cormier-Daire V, Brunner HG, Knoers NV, Roepman R. J. Med. Genet. 48(6), 390-5 (2011). PMID: 21378380.

Homozygosity mapping in outbred families with mental retardation. Schuurs-Hoeijmakers JH, Hehir-Kwa JY, Pfundt R, van Bon BW, de Leeuw N, Kleefstra T, Willemsen MA, van Kessel AG, Brunner HG, Veltman JA, van Bokhoven H, de Brouwer AP, de Vries BB. Eur. J. Hum. Genet. 19(5), 597-601 (2011). PMID: 21248743.

X-chromosome duplications in males with mental retardation: pathogenic or benign variants? Gijsbers AC, den Hollander NS, Helderman-van de Enden AT, Schuurs-Hoeijmakers JH, Vijfhuizen L, Bijlsma EK, van Haeringen A, Hansson KB, Bakker E, Breuning MH, Ruivenkamp CA. Clin. Genet. 79(1), 71-8 (2011). PMID: 20486941.

Recurrent deletion of ZNF630 at Xp11.23 is not associated with mental retardation. Lugtenberg D, Zangrande-Vieira L, Kirchhoff M, Whibley AC, Oudakker AR, Kjaergaard S, Vianna-Morgante AM, Kleefstra T, Ruiter M, Jehee FS, Ullmann R, Schwartz CE, Stratton M, Raymond FL, Veltman JA, Vrijenhoek T, Pfundt R, Schuurs-Hoeijmakers JH, Hehir-Kwa JY, Froyen G, Chelly J, Ropers HH, Moraine C, Gécz J, Knijnenburg J, Kant SG, Hamel BC, Rosenberg C, van Bokhoven H, de Brouwer AP. Am. J. Med. Genet. A 152A(3), 638-45 (2010). PMID: 20186789.

Refining the critical region of the novel 19q13.11 microdeletion syndrome to 750 Kb. Schuurs-Hoeijmakers JH, Vermeer S, van Bon BW, Pfundt R, Marcelis C, de Brouwer AP, de Leeuw N, de Vries BB. J. Med. Genet. 46(6), 421-3 (2009). PMID: 19487540.

A new diagnostic workflow for patients with mental retardation and/or multiple congenital abnormalities: test arrays first. Gijsbers AC, Lew JY, Bosch CA, Schuurs-Hoeijmakers JH, van Haeringen A, den Hollander NS, Kant SG, Bijlsma EK, Breuning MH, Bakker E, Ruivenkamp CA. Eur. J. Hum. Genet. 17(11), 1394-402 (2009). PMID: 19436329.

Extending the phenotype of recurrent rearrangements of 16p11.2: deletions in mentally retarded patients without autism and in normal individuals. Bijlsma EK, Gijsbers AC, Schuurs-Hoeijmakers JH, van Haeringen A, Fransen van de Putte DE, Anderlid BM, Lundin J, Lapunzina P, Pérez Jurado LA, Delle Chiaie B, Loeys B, Menten B, Oostra A, Verhelst H, Amor DJ, Bruno DL, van Essen AJ, Hordijk R, Sikkema-Raddatz B, Verbruggen KT, Jongmans MC, Pfundt R, Reeser HM, Breuning MH, Ruivenkamp CA. Eur. J. Med. Genet. 52(2-3), 77-87 (2009). PMID: 19306953.

LIST OF ABBREVIATIONS

List of abbreviations

A  Adenine
AAIDD  American Association on Intellectual and Developmental Disabilities
AD  autosomal dominant
ADHD  attention deficit hyperactivity disorder
AR  autosomal recessive
ARID  autosomal recessive intellectual disability
ARMR  autosomal recessive mental retardation
bp  base pair
Bpr  bruchpilot
C  Cytosine
CGH  comparative genomic hybridization
CH  cycloheximide
Cho  choline
CNCC  cranial neural crest cells
CNS  central nervous system
CNV  copy number variation
Cr  creatine
Ct  threshold cycles
CT  computed tomography
DD  developmental delay
DECIPHER  DatabasE of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources
DGV  Database of Genomic Variants
Dlg1  disc large 1
DNA  deoxyribonucleic acid
DSM  Diagnostic and Statistical Manual of Mental Disorders
EBV-LCLs  Epstein-Barr-virus transformed lymphoblastoid cell lines
ECARUCA  European Cytogeneticists Association Register of Unbalanced Chromosomal Aberrations
EEG  electroencephalogram
EM  electron microscopy
ENCC  enteric neural crest cells
FBR  furin cargo binding region
FISH  fluorescent in situ hybridization
G  Guanine
GFP  green fluorescent protein
HMM  hidden Markov model
HSP  hereditary spastic paraplegia
ICD  International Statistical Classification of Diseases and Related Health Problems
ID  intellectual disability
iPLA1  intracellular phospholipases A1
IQ  intelligence quotient
kb  kilobase (thousand base pairs)
KRAB  Kruppel-associated box
LCR  low copy repeat
Mb  megabase (million base pairs)
MLPA  multiplex ligation dependent probe amplification
MPS  massive parallel sequencing
MR  mental retardation

MRI  magnetic resonance imaging
mRNA  messenger ribonucleic acid
NAA  N-acetylaspartate
NA  not analyzed / not available
NMD  nonsense-mediated RNA decay
NMJ  neuromuscular junction
OFC  occipitofrontal circumference
OMIM  Online Mendelian Inheritance in Man database
ORF  open reading frame
PA  phosphatidic acid
PCR  polymerase chain reaction
Proton MR spectroscopy  proton magnetic resonance spectroscopy
PWMH  periventricular white matter hyperintensities
QPCR  quantitative polymerase chain reaction
RNA  ribonucleic acid
RNAi  inducible RNA interference
ROH  regions of homozygosity
SAM  sterile alpha motif
SD  standard deviation
SNP  single nucleotide polymorphism
SPG  spastic paraplegia
SROH  shared regions of homozygosity
T  Thymine
UCSC  University of California, Santa Cruz
UTR  untranslated region
VB  verstandelijke beperking (intellectual disability)
WT  wildtype
XL  X-linked
ZonMW  Nederlandse organisatie voor gezondheidsonderzoek en zorginnovatie (Netherlands Organisation for Health Research and Development)

COLOR FIGURES

Color figures

Figure 2 Distribution of inheritance mode of 472 human genes involved in ID phenotypes (chapter 1, page 13)

Genes are derived from Online Mendelian Inheritance in Man (ref. 14; July 2012) and have ID reported as a phenotypic feature in the 'clinical synopsis', 'clinical features' or 'description' section.

Figure 3 Increasing resolution of techniques applied for genome wide investigation of individuals with intellectual disability and their approximate cumulative diagnostic yield (in percentage) (chapter 1, page 15)

[Figure: timeline (axis labels 1970, 2000, 2010, 2012) of three techniques with their resolution and approximate cumulative diagnostic yield: G-banded karyotyping (5-10 Mb, 10-15%*), array CGH (20 kb, 30%) and exome sequencing (single nucleotide level, 65%**).]

* This percentage includes targeted techniques such as FISH karyotyping and subtelomeric MLPA. ** For exome sequencing, recent studies indicate a yield of up to ~65% when applied as first diagnostic test, depending on the inheritance model studied (refs. 106, 110, 115, 134, 157, 160).

Figure 4 Schematic representation of massive parallel sequencing principles based on library preparation using array-based and in-solution capture of target DNA (chapter 1, page 20)

[Figure: workflow with the steps genomic DNA; fragmentation, end repair and adapter ligation; read mapping, alignment and variant calling; variant prioritization and biological interpretation.]

First, genomic DNA is isolated and sheared into fragments, and sequencing adapters for multiplexing are attached. Selection of target sequences is achieved by hybridization against target-specific oligonucleotide microarrays or in-solution capture probes; the latter contain magnetic beads and are retained by a magnetic field. Washing removes unbound fragments, the generated library is multiplexed by PCR reaction, and fragments are sequenced simultaneously (massive 'parallel' sequencing). Alignment of the sequence reads to the human reference genome and variant calling allow for variant prioritization and interpretation. Adapted from Haas et al. 2011 (ref. 161).

Figure 5 Massive parallel sequencing (MPS) approaches to identify mutations causing ID (chapter 1, page 21)

[Figure panels: targeted MPS of a homozygous region; genome-wide MPS with family-based exome sequencing, intersection filtering, and two alleles in the same gene, identifying de novo variants, shared variants and recessive variants.]

(A) Homozygosity mapping followed by targeted MPS of the homozygous linkage intervals, as described by Najmabadi et al. (ref. 110). (B) X-chromosome exome sequencing: sequencing of all coding exons of the X-chromosome in families with an apparent X-linked inheritance pattern, followed by segregation testing. This approach was taken by Kalscheuer et al. (ref. 160). (C) Family-based exome sequencing to identify de novo rare variants in sporadic cases with non-syndromic ID/autism, applied on a large scale by several studies (refs. 115, 131-134). (D) Exome sequencing in syndromic ID to identify rare variants in the same gene, in genes with a similar function or in genes functioning in the same protein complex. This approach led to the identification of many dominant syndromic ID genes (refs. 109, 11?, 113, ?, 123, 162). (E) Exome sequencing to identify rare potentially recessive variants (homozygous and compound heterozygous) in siblings with (non)-syndromic ID, applied in a systematic screen of twenty sibling families by Schuurs-Hoeijmakers et al. (refs. 157, 158).
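The sibling-based recessive filter of approach (E) can be illustrated with a minimal Python sketch. It is not the pipeline used in this thesis: the variant records, gene names and the function `recessive_candidates` are hypothetical, genotypes are assumed to be coded as the number of alternate alleles (0, 1 or 2), and rarity filtering is assumed to have been done beforehand.

```python
from collections import defaultdict

# Hypothetical toy data: one record per rare variant, with the number of
# alternate alleles (0, 1 or 2) observed in each affected sibling.
variants = [
    {"id": "v1", "gene": "GENE_A", "genotypes": {"sib1": 2, "sib2": 2}},
    {"id": "v2", "gene": "GENE_B", "genotypes": {"sib1": 1, "sib2": 1}},
    {"id": "v3", "gene": "GENE_B", "genotypes": {"sib1": 1, "sib2": 1}},
    {"id": "v4", "gene": "GENE_C", "genotypes": {"sib1": 1, "sib2": 0}},
    {"id": "v5", "gene": "GENE_D", "genotypes": {"sib1": 1, "sib2": 1}},
]


def recessive_candidates(variants, siblings):
    """Return genes with variants compatible with recessive inheritance.

    A gene qualifies if all siblings share a homozygous variant, or if all
    siblings share at least two heterozygous variants in that gene.
    """
    homozygous = defaultdict(list)
    shared_het = defaultdict(list)
    for v in variants:
        calls = [v["genotypes"].get(s, 0) for s in siblings]
        if all(c == 2 for c in calls):
            homozygous[v["gene"]].append(v["id"])
        elif all(c == 1 for c in calls):
            shared_het[v["gene"]].append(v["id"])

    result = {}
    for gene, ids in homozygous.items():
        result[gene] = ("homozygous", sorted(ids))
    for gene, ids in shared_het.items():
        # Two shared heterozygous variants are only a candidate: parental
        # genotypes are needed to confirm that they lie on different alleles.
        if len(ids) >= 2 and gene not in result:
            result[gene] = ("compound_heterozygous_candidate", sorted(ids))
    return result


candidates = recessive_candidates(variants, ["sib1", "sib2"])
```

In this toy input, GENE_A is retained as a shared homozygous candidate and GENE_B as a compound heterozygous candidate, while GENE_C (not shared) and GENE_D (a single heterozygous variant) are dropped.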


Figure 6 Total number of genes involved in ID phenotypes and identified from 1980-2012 (chapter 1, page 26)

*For 2012, this is an expected total number based on the first six months of 2012. Colors represent the different inheritance modes: gray for autosomal recessive ID genes (AR), turquoise for autosomal dominant genes (AD) and purple for X-chromosomal genes (X-linked).


Figure 2 SROHs showing overlap with MRT7-10 (http://genome.ucsc.edu/, hg18) (chapter 2, page 43)

[Figure panels A-F: UCSC Genome Browser views. Legible tracks in panel F: MRT11 locus; Chromosome Bands Localized by FISH Mapping Clones (19q13.11 to 19q13.32); RefSeq Genes, including members of the CEACAM, PSG, LYPD and ZNF gene families.]

A) Overlap of the MRT7 locus with ARMR1 and ARMR2. B) Overlap of the MRT8 locus with ARMR2, 3, 6 and 7. The first three families share the same haplotype. This region is also reported by Lencz et al.208 to be homozygous in 9% of 144 healthy individuals. C) Overlap of the MRT9 locus with ARMR7. D) Overlap of the MRT10 locus with ARMR1, 6, 7, 8 and 10. All families share the same haplotype. This region is reported to be homozygous in 15% of 144 healthy individuals by Lencz et al.208. E) 11 Mb SROH of ARMR1, showing 2.9 Mb overlap with the MRT11 locus. F) Enlargement of the 2.9 Mb overlap containing 98 genes.


Figure 2 Coverage plots of exome experiments of sibling samples (chapter 3, page 60)

[Coverage plots. Y-axis: percentage of target regions (0% to 100%). X-axis: minimum fold coverage (1x, 5x, 10x, 15x, 20x, 25x, 30x, 35x, 40x). Samples in panel A: W10-1338, W09-2166, W11-3472, W09-1109, W08-0135, W07-1601, W05-385, W10-1134, W09-0070, W08-0748, W07-1443, W09-1095, W10-1180, W10-1137, W10-1643, W07-1585, W11-0515, W06-0984, W11-3400, W10-2749. Samples in panel B: W08-0135, W07-1601, W09-1095, W10-2749.]

(A) Percentage of target regions with at least n-fold average coverage, showing that for each DNA sample of the families more than 80% of the targets are covered more than 10-fold. (B) Percentage of target regions on the X-chromosome with at least n-fold average coverage for cases with potential X-linked inheritance, showing that for each sample more than 80% of the targets are covered more than 10-fold.
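The curves in such a plot can be derived from the average coverage of each target region. The following is a minimal sketch under that assumption; the function name, the threshold list and the example coverage values are invented for illustration and are not data from this thesis.

```python
def fraction_covered(target_coverage, thresholds=(1, 5, 10, 15, 20, 25, 30, 35, 40)):
    """Return {n: fraction of target regions with average coverage >= n}."""
    total = len(target_coverage)
    if total == 0:
        raise ValueError("no target regions supplied")
    return {
        n: sum(cov >= n for cov in target_coverage) / total
        for n in thresholds
    }


# Invented average coverage values for ten target regions of one sample.
example = [0.0, 4.2, 12.5, 18.0, 33.1, 47.9, 52.3, 8.8, 25.0, 61.4]
curve = fraction_covered(example)
print(f"{curve[10]:.0%} of targets covered at least 10-fold")
```

Each sample yields one such curve; plotting the dictionary values against the thresholds gives one line of the figure.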

Figure 3 Validation and segregation testing of homozygous deletions (chapter 3, page 68)

[Figure panels: (A) chromosome 19, family W09-1109, individual 3 compared with controls, SIGLEC14; (B) chromosome 7, family W09-2166, individual 2 compared with controls; (C) chromosome 5, family W10-1134, individual 2 compared with a control. Each panel includes the family pedigree.]

Validation of homozygous deletions was done by genomic qPCR as previously described by de Leeuw et al.243. (A) Exome sequencing coverage for the proband of W09-1109 (arrow, individual 3) is shown as compared to two control samples and indicates a homozygous deletion of the last exons of SIGLEC14. The homozygous deletion is confirmed by genomic qPCR in the proband, but is absent in his affected siblings (individuals 1 and 2). (B) Exome sequencing coverage for the proband of W09-2166 (arrow, individual 2) is shown as compared to two control samples and indicates a homozygous deletion of the first exons of ZAN. The homozygous deletion is confirmed by genomic qPCR in the proband, but is present in the heterozygous state in affected individual 1. (C) Exome sequencing coverage for the proband of W10-1134 (arrow, individual 2) is shown as compared to two control samples and indicates a homozygous deletion of the first exons of BTNL3. The homozygous deletion is confirmed by genomic qPCR in the proband, but does not segregate with the ID phenotype in individuals 1 and 3.


Figure 4 Study design and pathogenic mutations in known and novel genes for intellectual disability (chapter 3, page 70)

[Figure panels A-D: pedigrees with genotypes, raw sequence reads and Sanger chromatograms. Legible labels: TDP2 genotypes (-/-, M/M, M/M, M/M); TDP2 mutation interpretation; gene expression profile; SLC6A8.]

(A) Study design: (i) family selection, represented by the pedigree of family W09-1095; (ii) exome sequencing and variant selection resulting in candidate mutations, represented by the raw sequence reads of our candidate mutation, c.425+1G>A, in TDP2; (iii) validation and segregation testing by Sanger sequencing, depicted by the chromatograms showing the mutant sequence that was present in all affected male individuals and the wildtype sequence present in the unaffected sister; and (iv) variant interpretation (see legend Table f for further details). (B) Pedigree of family W10-1338, showing segregation of the compound heterozygous mutations, c.1804_1805insT and c.2057delA, in DDHD2. (C) Pedigree of family W08-0135, showing segregation of the hemizygous stop mutation, c.1639G>T, in SLC9A6. (D) Pedigree of family W10-2749, showing segregation of the hemizygous amino acid deletion, c.1005_1007delCAA, in SLC6A8. Arrows indicate the probands. M = mutation, M1 and M2 are the two different alleles of compound heterozygous mutations, ND = not determined, - = normal allele present.


Figure 5 Flow diagram showing step-by-step variant classification of segregating recessive sequence variants (chapter 3, page 72)

[Flow diagram, starting from: Rare variant]

Rare variants (non-synonymous, splice site and protein truncating) that segregated with the ID phenotype were classified as pathogenic if they resided either in a known ID gene, or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype upon further study. Variants were labeled potentially pathogenic if they fulfilled four criteria: (i) the gene is already linked to a human neurologic phenotype other than ID, or is not linked to any human phenotype, as described in the Online Mendelian Inheritance in Man database (OMIM; www.ncbi.nlm.nih.gov/OMIM/); (ii) there is mRNA expression of the respective gene in brain/neuronal tissue according to the expressed sequence tags database (www.ncbi.nlm.nih.gov/dbEST/); (iii) the effect of missense variant(s) is predicted to be 'disease causing/damaging' by SNPs&GO218 and/or PolyPhen-2219; and (iv) the altered amino acid is conserved amongst vertebrates. Variants that did not fulfill these four criteria were classified as likely benign.
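The decision rules of this legend can be written out as a short function. This is a minimal sketch of the stated rules, not the software used in the study: the class, its field names and the `classify` function are hypothetical, and the prediction criterion (iii) is reduced to a single boolean even though the legend applies it to missense variants.

```python
from dataclasses import dataclass


@dataclass
class SegregatingVariant:
    """Hypothetical record for a rare variant that segregates with ID."""
    in_known_id_gene: bool
    second_mutation_in_similar_patient: bool
    gene_non_id_neuro_or_no_phenotype: bool   # criterion (i), from OMIM
    expressed_in_brain: bool                  # criterion (ii), from dbEST
    predicted_damaging: bool                  # criterion (iii), SNPs&GO and/or PolyPhen-2
    residue_conserved_in_vertebrates: bool    # criterion (iv)


def classify(variant):
    """Apply the three-way classification described in the legend."""
    if variant.in_known_id_gene or variant.second_mutation_in_similar_patient:
        return "pathogenic"
    criteria = (
        variant.gene_non_id_neuro_or_no_phenotype,
        variant.expressed_in_brain,
        variant.predicted_damaging,
        variant.residue_conserved_in_vertebrates,
    )
    # All four criteria must hold; a single failure gives "likely benign".
    return "potentially pathogenic" if all(criteria) else "likely benign"
```

Testing the pathogenic rule first reflects the order of the flow diagram: the four criteria are only evaluated for variants in genes without an established link to ID.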


Figure 6 Segregation of potentially pathogenic missense variants (chapter 3, page 73)

[Figure panels A-E: pedigrees with genotypes and Sanger chromatograms. Legible genotype labels: MCM3AP (M/M, M/M); BCORL1 (M); PTPRT (M/Del, M/Del, M/Del, -/Del, M/-), with a chromosome 20 SNP array plot; SYNE1 (M1/M2/M3, M1/M2/M3); ZNF582 (M1/M2, M1/M2).]

(A) Pedigree of family W05-385 with Sanger confirmation of homozygous variant c.2743G>A (M) in MCM3AP. (B) Pedigree of family W07-1601 with Sanger confirmation of the hemizygous variant c.2459A>G (M) in BCORL1. (C) Pedigree of family W09-1109 with Sanger confirmation of variant c.4094C> in PTPRT. Individuals in gray have borderline intellectual functioning. Bottom panel shows the 150 kb deletion on chromosome 20 in the proband, arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1, detected with 250k SNP array testing. (D) Pedigree of family W10-1137 with Sanger confirmation of variants c.1964A>G (M1), c.9262G>A (M2) and c.11675T>C (M3) in SYNE1. (E) Pedigree of family W11-3472 with heterozygous variants c.193T>G (M1) and c.1034G>A (M2) in ZNF582.


Figure 7 Predicted effect of c.425+1G>A mutation on mRNA splicing (chapter 3, page 74)

[Schematic of exons 1 to 4 and 7 in four situations: wildtype situation; scenario A, intron 3 retention (position of the G to A mutation indicated); scenario B, exon 3 skipping; scenario C, use of alternative splice donor site. A premature stop is marked in each scenario.]

Scenario A leads to intron 3 retention, resulting at the protein level in a premature stop codon, p.Leu142fs*; scenario B leads to exon 3 skipping, thereby introducing a premature stop codon, p.Tyr84*; and scenario C shows use of an alternative splice donor site, causing a shift of the reading frame, p.Gly135fs*16.


Figure 1 Pedigrees of families 1 to 4 with photographs of the face and chromatograms showing Sanger confirmation of the mutations in DDHD2 (NM_015214.2) (chapter 4, page 82)

[Figure panels A-D: pedigrees, facial photographs and Sanger chromatograms. Family 1: M1 = p.Thr602Ilefs*18, M2 = p.Glu686Glyfs*35. Family 2: M1 = p.Ile463Hisfs*6, M2 = p.Asp660His. Family 3: individuals numbered 20 to 26. Family 4: M = p.Arg287*.]


Figure 1 Continued (chapter 4, page 83)

[Panel E: schematic of the DDHD2 protein from N- to C-terminus, showing the WWE, lipase, SAM and DDHD domains and the positions of p.Arg287*, p.Ile463Hisfs*6, p.Arg516*, p.Thr602Ilefs*18, p.Asp660His and p.Glu686Glyfs*35.]

Arrows indicate the individuals on whom exome sequencing was performed. M = mutant allele, - = wildtype allele, C = control. (A) Family 1 (family identifier: W10-1338), with an affected sister and brother and compound heterozygous frameshift mutations, c.1804_1805insT (p.Thr602Ilefs*18) and c.2057delA (p.Glu686Glyfs*35). (B) Family 2, with an affected sister and brother and compound heterozygous frameshift and missense mutations, c.1386dupC (p.Ile463Hisfs*6) and c.1978G>C (p.Asp660His). (C) Consanguineous family 3, with seven affected individuals and a homozygous mutation c.1546C>T (p.Arg516*). Pedigree numbering is according to the original pedigree by Al-Yahyaee et al.253. (D) Consanguineous family 4 (family identifier: W12-0041) with one affected male individual and a homozygous mutation c.859C>T (p.Arg287*). (E) The protein structure of DDHD2 including its four domains (WWE, lipase, SAM, and DDHD domain) with the position of all identified mutations indicated.


Figure 2 DDHD2 exome experiment (chapter 4, page 86)

[Integrative Genomics Viewer panels of DDHD2. Legible labels in panel B: Chr8: 38109471, p.Ile463Hisfs*6 (left); Chr8: 38111160, p.Asp660His (right); Family 2: II-2.]

Raw sequencing reads of mutations in DDHD2 identified by exome sequencing and visualized with Integrative Genomics Viewer (http://www.broadinstitute.org/igv/home). A) Individual II-2 of family 1, left panel: 19 of 71 reads (26%) showed a deletion of A at chr8(hg19):g.38117560; right panel: 36 of 66 reads (55%) showed an insertion of T at chr8(hg19):g.38110558-38110559. B) Both affected siblings (II-1 and II-2) of family 2 show on the left panel a heterozygous insertion of C at chr8(hg19):g.38109471 and on the right panel a heterozygous base pair substitution G to C at chr8(hg19):g.38111160. The father (I-1) shows the heterozygous insertion (left panel), but not the substitution (right panel).
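Read counts such as those quoted for family 1 are usually summarized as the fraction of reads that support the variant. The sketch below computes that fraction and applies a simple zygosity call; the 0.2 to 0.8 window for a heterozygous call and both function names are illustrative assumptions, not thresholds or tools taken from this thesis.

```python
def variant_read_fraction(variant_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


def call_zygosity(fraction, het_window=(0.2, 0.8)):
    """Crude call from the variant read fraction (window is an assumption)."""
    low, high = het_window
    if fraction < low:
        return "not called"
    if fraction <= high:
        return "heterozygous"
    return "homozygous"


# Read counts quoted in the legend for individual II-2 of family 1.
for label, alt, total in [("deletion of A", 19, 71), ("insertion of T", 36, 66)]:
    fraction = variant_read_fraction(alt, total)
    print(f"{label}: {alt}/{total} reads, fraction {fraction:.3f}, {call_zygosity(fraction)}")
```

Insertions and deletions often show a lower variant read fraction than substitutions because reads spanning them are harder to align, which is one reason to validate such calls by Sanger sequencing.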


Figure 11 Oil Red O staining of cultured fibroblasts of controls and affected individuals (chapter 4, page 97)

[Micrograph panels: Family 1, II-1; Control 1; Family 2, II-2; Control 2; Family 4, II-1; Oil Red O control staining. Scale bars: 100 µm.]

Representative fibroblast cells of individuals II-1 of family 1 (A), II-2 of family 2 (C) and II-1 of family 4 (E) and two control individuals (B, D). (F) Positive control. Cells were stained with Oil Red O to visualize lipid droplets. No difference was observed in the number and appearance of lipid droplets between cells of affected individuals (n=3) and controls (n=3). Cells were seeded and cultured overnight on glass slides, fixed in 3.7% formalin for 10 minutes, rinsed in demineralized water, stained for 30 minutes in filtered Oil Red O solution dissolved in isopropanol, and rinsed in demineralized water, followed by a brief counterstaining in hematein. After rinsing in tap water for 10 minutes, the slides were sealed using xylol-based mounting medium.


Figure 12 Electron micrographs (EM) of human fibroblasts of affected individuals and controls (chapter 4, page 98)

[Electron micrograph panels A-O. Legible row labels: Control 1 (A); Family 3, IV-10 (J); Family 4, II-1 (M). Scale bars: 4 µm, 10 µm and 20 µm.]

Representative electron micrographs of two control individuals (A-F) and individuals II-1 of family 1 (G-I), IV-10 of family 3 (J-L) and II-1 of family 4 (M-O). Left panels 1.2K (A,D,G,M) and 2K (J) images. Middle and right panels 6K images. For panels A-I and M-O: spun-down pellets of cultured fibroblasts were fixed in 2% glutaraldehyde in 0.1 M phosphate buffer for 4 hours, rinsed in phosphate buffer and postfixed for 1 hour in 1% osmium containing 1% potassium hexacyanoferrate. Semithin (1 µm) sections for previewing and ultrathin (70 nm) sections were cut on a Leica EM UC6 ultramicrotome, collected on 200-mesh copper grids and contrasted with uranyl acetate and lead citrate double stain (as previously described by Sjostrand et al.273 and Reynolds et al.274). For panels J-L: cultured fibroblasts were fixed with the formaldehyde-glutaraldehyde method as described by Karnovsky et al.275. Semithin (1.30 µm) and ultrathin (95 nm) sections were prepared and stained with 1% aqueous toluidine blue on glass slides and 200-mesh Cu grids, respectively, and contrasted with uranyl acetate and lead citrate double stain273,274. All sections were examined and images generated in a Jeol JEM1200 Transmission Electron Microscope (Jeol, Netherlands). Abbreviations: N = nucleus, Nuc = nucleolus, ER = endoplasmic reticulum, M = mitochondria. Red asterisks indicate vacuoles, arrows indicate glycogen.


Figure 13 Confocal images of cultured fibroblast cells (chapter 4, page 99)

[Confocal image panels A-F. Left column: control fibroblast cells. Right column: family 3, IV-10 fibroblast cells. Scale bar: 10 µm.]

Confocal images of representative cultured fibroblast cells from affected individual IV-10 of family 3 and a healthy control individual. Fibroblast cells were cultured on sterile cover slips, fixed by methanol, incubated with monoclonal or polyclonal anti-Calnexin, anti-Golgin-97 and anti-COPII antibodies and re-incubated with the appropriate rhodamine-labelled secondary antibodies. Finally, fixed cells were mounted in immunofluor medium (ICN Biomedicals, Irvine, California, USA). Data were acquired using a Nikon confocal microscope 1. Images, presented as single sections in the z-plane, were prepared using Adobe Photoshop (Adobe Inc.). Biomarkers Calnexin (A, B), Golgin-97 (C, D) and a COPII subunit (E, F) were used to stain the endoplasmic reticulum, the Golgi complex and endoplasmic reticulum exit sites, respectively, in fibroblasts from individual IV-10 of family 3 and from a control. All three biomarkers show normal organelle distribution in all cells.


Figure 15 Synapse morphology and organization at the Drosophila neuromuscular junction of control and CG8552 (Ddhd) knockdown flies (chapter 4, page 101)

[Panel B: bar chart with P-values, one bar per line: vdrcGD60000, vdrcGD35956, vdrcGD35957, vdrcKK60100, vdrcKK108121.]

Three different RNAi lines, vdrcGD35956, vdrcGD35957, and vdrcKK108121, from the Vienna Drosophila Research Center, were used and compared to their genetic background lines, vdrcGD60000 and vdrcKK60100, respectively. RNA interference was induced with the pan-neuronal UAS-dicer2; elav-Gal4 driver. Drosophila muscle 4 type 1b neuromuscular junctions (NMJs) were analyzed as previously described268. (A) Anti-dlg1 (upper panel) and anti-brp immunolabelling (middle panel) at the NMJ of control (vdrcGD60000) and Ddhd knockdown larvae (vdrcGD35956), and output of computer-assisted analysis with an in-house developed macro (bottom panel). Each white dot represents one active zone. (B) Quantification of active zones shows a significant reduction in all three RNAi lines compared to their genetic background controls. P = P-values (two-sided t-tests), n = number of quantified synaptic terminals. Error bars indicate standard error of the mean.


Figure 1 Photographs and genetic data of two unrelated individuals with an identical de novo mutation in PACS1 (NM_018026.2) (chapter 5, page 108)

[Figure panels. (B) Exome sequence reads and Sanger chromatograms for individual 1 and individual 2, each with mother and father; translated sequence context: Arg Tyr Lys Asn Trp Thr Ile. (C) PACS1 protein schematic showing the CK2-binding motif, the autoregulatory domain, the C-terminal region and the position of p.(Arg203Trp).]

(A) Upper photograph: Individual 1 at 4 years of age; low anterior hairline, highly arched eyebrows, synophrys, hypertelorism with downslant of the palpebral fissures, long eyelashes, bulbous nasal tip, flat philtrum with thin upper lip, downturned corners of the mouth, diastema of teeth and low-set ears. Bottom photograph: Individual 2 at 12 years of age. Note the remarkable facial similarity. (B) Sequence reads from exome sequencing and chromatograms of Sanger confirmation, showing the identical de novo occurrence of the c.607C>T mutation in PACS1 in families 1 and 2. (C) Protein structure of PACS1, indicating the position of the p.Arg203Trp substitution in the Furin (cargo)-binding region (FBR) of the protein, directly adjacent to the CK2-binding motif.


Figure 4 In vivo functional characterization of the p.Arg203Trp substitution in PACS1 (chapter 5, page 113)

[Panel A. Image labels: Lateral, Dorsal; Arg203. Chart categories: Arg203, Trp203, Arg203/Trp203.]

(A) Alcian blue staining of 4-day-old zebrafish larvae expressing either 50 pg wildtype (WT; c.607C resulting in p.Arg203) or 50 pg mutant (c.607T resulting in p.Trp203) PACS1 RNA. Left panel: craniofacial cartilaginous structures visualized in both lateral and ventral views of the embryo. Right panel: craniofacial phenotypes in embryos expressing WT PACS1, mutant PACS1, and both WT and mutant PACS1 combined, are quantified. White arrows and asterisks highlight Meckel's cartilage in the lateral and ventral perspectives of the embryos. Human PACS1 WT and mutant mRNA were in vitro transcribed using an mMESSAGE mMACHINE SP6 Kit (Ambion) and 0.5 nL was microinjected into 2-4 cell stage zebrafish embryos.


Figure 4 Continued (chapter 5, page 114)

[Panel B. Image labels: Trunk, Head; Arg203, Trp203; SOX10::GFP. Chart legend: normal, craniofacial abnormalities. Chart categories: Arg203, Trp203, Arg203/Trp203.]

(B) Imaging of 4-day-old sox10::eGFP zebrafish larvae expressing either 50 pg WT or 50 pg mutant PACS1 RNA. Left panel; migration of eGFP labeled CNCCs. Right panel; CNCC migration phenotype scored in embryos expressing WT PACS1, mutant PACS1 and both WT and mutant PACS1 combined.


Figure 5 Immunoblot of protein stability and flatmounting of zebrafish embryos (chapter 5, page 115)

(A) Representative immunoblot showing protein stability levels for wildtype (p.Arg203) and mutant (p.Trp203) PACS1. Lysates were obtained after 0, 2, 4, and 6 hrs of cycloheximide treatment. Experiments were performed in triplicate. (B) Light micrograph images showing flatmounted, alcian blue stained 4-dpf embryos injected with wildtype or mutant PACS1.


Figure 6 PACS1 mRNA levels in PACS1 injected zebrafish (chapter 5, page 116)


Quantification of Sox10::GFP mRNA levels in PACS1 injected zebrafish. RNA isolated from trunk and head regions of wildtype (WT; c.607C) versus mutant (c.607C>T resulting in p.Arg203Trp) PACS1 injected zebrafish was used for real-time qPCR analysis. Student's t-test was performed; *** p < 0.001. Error bars represent standard deviation.


Figure 7 In vitro functional characterization of the p.Arg203Trp substitution in PACS1 (chapter 5, page 117)

[Figure panels. (A) Arg203, Trp203. (B) X-axis: time after CHX treatment. (C) Lanes: PACS1 Arg203-GFP, PACS1 Trp203-GFP, TRPV4v1-V5, TRPV4v2-V5; blots: IP: GFP, IB: V5 and IP: GFP, IB: GFP; size marker: 95 kDa.]

(A) Localization of GFP-wild type (GFP-WT) and GFP-mutant (GFP-Trp203) PACS1 in transfected ARPE-19 cells grown to confluence and stained with a GFP antibody. ARPE-19 cells were grown in Dulbecco's Modified Eagle Medium and Ham's F-12 Nutrient 1:1 mixture (DMEM/F-12, Invitrogen) with 10% FBS and 2 mM L-glutamine. Transfection of PACS1 WT and mutant plasmids was carried out using FuGene6 Transfection Reagent (Roche). Cells were fixed with 4% PFA 72 hrs after transfection and probed with an anti-GFP antibody (Santa Cruz, sc-8334), followed by a secondary antibody, Alexa Fluor 488 IgG (Invitrogen). (B) Quantification of WT and p.Trp203 PACS1 protein stability in transfected cells treated with CHX. Represented is the mean measurement of triplicate experiments, with the error bar representing the standard error of the mean. HEK 293FT cells were grown in Dulbecco's Modified Eagle Medium (DMEM, Invitrogen) containing 10% Fetal Bovine Serum (Invitrogen) and 2 mM L-glutamine (Invitrogen). Cells were treated with 50 mM cycloheximide (Sigma) for 6 hours and harvested in co-IP buffer. (C) Immunoprecipitation of GFP-tagged WT and mutant PACS1 and V5-tagged TRPV4v1 (NM_021625) and TRPV4v2 (NM_147204). HEK293 cells were transfected with tagged constructs and harvested in co-IP buffer after 48 hrs. Immunoprecipitation was performed with an anti-GFP antibody and immunoblotted with a V5 antibody.

Figure 1 Flow diagram for classification of rare variants in individuals with ID (chapter 6, page 126)

[Flow diagram. Legible labels: Research, Diagnostics]

Thesis series of the Institute for Genetic and Metabolic Disease

Radboud University Medical Centre

1. Guillard, M. (2012). Biochemical and clinical investigations in the diagnosis of congenital disorders of glycosylation. Radboud University Nijmegen, Nijmegen, The Netherlands.

2. Markus, K. (2012). Strain or no strain. The application of two-dimensional strain echocardiography in children. Radboud University Nijmegen, Nijmegen, The Netherlands.

3. Jonckheere, A. (2012). Mitochondrial Medicine: assay development and application with special emphasis on human complex V. Radboud University Nijmegen, Nijmegen, The Netherlands.

4. Wessels, J. (2012). Mitochondrial proteomics: method development and application. Radboud University Nijmegen, Nijmegen, The Netherlands.

5. Vries de, H. (2012). Risk assessment of biological treatment in inflammatory bowel disease & analysis of genetic susceptibility factors. Radboud University Nijmegen, Nijmegen, The Netherlands.

6. Brom, M. (2013). Development of a tracer to image pancreatic beta cells. Radboud University Nijmegen, Nijmegen, The Netherlands.

7. Schuurs-Hoeijmakers, J.H.M. (2012). Gene identification in intellectual disability. Radboud University Nijmegen, Nijmegen, The Netherlands.
