AMYOTROPHIC LATERAL SCLEROSIS: GENETIC SUSCEPTIBILITY FACTORS AND PLEIOTROPY

Frank P Diekstra
English title: Amyotrophic lateral sclerosis: genetic susceptibility factors and pleiotropy
Nederlandse titel: Amyotrofische laterale sclerose: genetische risicofactoren en pleiotropie
Cover design & layout: Nadine Reef / www.nadinereef.nl
Printing: Ridderprint BV

ISBN 978-94-6299-180-4

© 2015 F.P. Diekstra. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage or retrieval system, without permission in writing from the author (where appropriate).

AMYOTROPHIC LATERAL SCLEROSIS: GENETIC SUSCEPTIBILITY FACTORS AND PLEIOTROPY

AMYOTROFISCHE LATERALE SCLEROSE: GENETISCHE RISICOFACTOREN EN PLEIOTROPIE

(with a summary in Dutch)

THESIS submitted for the degree of doctor at Utrecht University, by authority of the rector magnificus, prof.dr. G.J. van der Zwaan, in accordance with the decision of the Board for the Conferral of Doctoral Degrees, to be defended in public on Tuesday 27 October 2015 at 10.30 in the morning by

Frank Paul Diekstra
born on 4 August 1983 in Nijmegen

Supervisors:
Prof. dr. L.H. van den Berg
Prof. dr. J.H. Veldink

Now that the patient is no longer here, we can and must speak to one another in all frankness. The most varied remedies, including those whose use seems most rational, will be powerless to slow the progressive course of the disease. It is sad to say, but that is how it is: for the physician, the question is not whether it is sad, but whether it is true. We are sometimes reproached, it seems, for our persistent studies of the major nervous diseases, which until now have most often been incurable; what is the use of it? Come, our duty lies elsewhere: let us search, in spite of everything, let us keep searching; that is still the best way to find, and perhaps, thanks to our efforts, tomorrow's verdict will not be today's verdict?

Jean-Martin Charcot, Policlinique du Mardi 28 Février 1888

TABLE OF CONTENTS

Chapter 1   Introduction

PART I: GENETIC SUSCEPTIBILITY FACTORS FOR ALS

Chapter 2   Interaction between PON1 and population density in amyotrophic lateral sclerosis
Chapter 3   A case of ALS-FTD in a large FALS pedigree with a K17I ANG mutation
Chapter 4   Mapping of expression reveals CYP27A1 as a susceptibility gene for sporadic ALS

PART II: GENETIC PLEIOTROPY

Chapter 5   No evidence for shared genetic basis of common variants in multiple sclerosis and amyotrophic lateral sclerosis
Chapter 6   C9orf72 and UNC13A are shared risk loci for ALS and FTD: A genome-wide meta-analysis

PART III: GENETIC DISEASE MODIFIERS

Chapter 7   UNC13A is a modifier of survival in amyotrophic lateral sclerosis
Chapter 8   Genetic modifiers in C9orf72 repeat expansion carriers: a genome-wide analysis

Chapter 9   Summary and general discussion
Chapter 10  Nederlandse samenvatting (summary in Dutch)

Dankwoord (acknowledgements)
List of publications
Curriculum vitae

1

INTRODUCTION

INTRODUCTION

Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterized by progressive muscle weakness, spasticity, dysarthria and, ultimately, respiratory muscle insufficiency. These symptoms are caused by the loss of motor neurons in both the brain and the anterior horn of the spinal cord. The French neurologist Jean-Martin Charcot first described amyotrophic lateral sclerosis in 1869 [1]. The disorder is also referred to as motor neuron disease or Lou Gehrig’s disease (named after the famous American baseball player who died of ALS). Symptoms most often start in one body region, for example with hand muscle atrophy or slurred speech, and subsequently progress to other parts of the body.

Population-based studies have estimated an incidence rate of 2.16-2.8 per 100,000 person-years in the European population, and the prevalence is about 4-10 per 100,000 [2-4]. The lifetime risk of ALS is about 1:350-400. There is a male preponderance, with a male-to-female ratio of about 1.25-1.4 [2, 3]. The peak incidence of ALS lies between 70 and 75 years of age in males and somewhat earlier in females, between 65 and 70 years of age [3, 5, 6]. After the age of 80 years, incidence rates drop rapidly.

ALS is a rapidly progressive and fatal disease.
The median survival time from onset is approximately three years [7, 8]. To date, no cure is available, and denervation of the respiratory muscles or dysphagia leading to respiratory complications are the most common fatal events [9]. Unfavorable prognostic factors include older age at onset, bulbar onset, a low body mass index (BMI) and poor nutritional status, concomitant cognitive decline, and impaired respiratory function [10]. Riluzole, a glutamate inhibitor, is the only drug with a proven effect, prolonging survival by 2-3 months on average [11].

Classically, ALS is divided into two forms: sporadic ALS or sALS (in which there is no apparent family history of ALS) and familial ALS (fALS), defined as the presence of at least one affected first- or second-degree relative. Approximately 5-10% of ALS cases are designated as familial ALS [12]. The distinction between sporadic and familial ALS has mainly been useful for genetic linkage studies, as familial ALS patients appear to have a higher frequency of monogenic or oligogenic causes, showing a Mendelian inheritance pattern. Clinically, however, sporadic ALS and familial ALS are usually indistinguishable, although there are some forms with a younger age at onset or with additional neurological symptoms such as parkinsonism or cognitive symptoms [13]. Recent advances in the study of genetic causes of ALS have identified the same familial ALS mutations in a proportion of sporadic ALS patients. Furthermore, apparently sporadic cases might actually carry mutations in ‘familial’ ALS genes, because of an unreliable family history, small pedigree size, or incomplete gene penetrance [14]. Therefore, the question arises whether this distinction between sporadic and familial ALS still holds.

The etiology of sporadic ALS remains largely unknown. The general view is that sporadic ALS is a disorder of complex etiology, in which both environmental factors and genetic factors play a role.
Multiple environmental factors have been studied, of which many remain controversial. Smoking appears to be the most established risk factor for ALS [15, 16]. Alcohol consumption may have no or even a protective effect [17]. Furthermore, there is evidence for an increased risk of ALS in persons with intensive physical exercise, among Gulf War veterans, persons exposed to pesticides, and persons exposed to heavy metals like lead or to certain occupational hazards like welding or electricity [6, 18-23]. The latter risk factors, however, remain controversial [23-25].

In the past years, interest in the genetic causes of ALS has grown tremendously. Studies of twin pairs discordant for ALS have estimated a considerable heritability of around 60% [26, 27]. Heritability estimates using genome-wide data appear to be lower (11-21%), possibly due to statistical aspects or because these calculations are based on common variants and do not account for the proportion of disease-causing rare variants [28-30]. The first causative gene for ALS, copper/zinc superoxide dismutase 1 (SOD1), was discovered in 1993 in several large pedigrees with familial ALS [31]. First through linkage studies, using genetic markers across the genome in affected and unaffected family members, and more recently by using next-generation sequencing techniques, additional causative genes have been discovered in familial ALS. These genes include ALS2, SETX, SPG11, FUS, VAPB, ANG, TARDBP, FIG4, OPTN, VCP, UBQLN2, SIGMAR1, PFN1, C9orf72, MATR3, CHCHD10, and TBK1 (Figure 1). The familial ALS genes are involved in different physiological processes, including endosome trafficking, RNA transcription or processing, neurofilament formation, and axonal transport. Most of the mutations in the aforementioned genes have a dominant mode of inheritance, although mutations in ALS2 and SPG11 usually are recessive, and mutations in UBQLN2 are X-linked [13]. As noted previously, the clinical presentation of patients carrying mutations in these familial ALS genes is mostly indistinguishable from other (for example sporadic) ALS patients.
Exceptions are ALS2 and SPG11, which are associated with juvenile ALS; FUS and ANG, which are associated with ALS and parkinsonism; and C9orf72, which also causes frontotemporal dementia (FTD) [13].

The frequency and distribution of familial ALS gene mutations can vary strongly across different populations. For example, SOD1 mutations are frequent in Scandinavian, Belgian and US ALS pedigrees, but are a rare cause of ALS in The Netherlands [32]. Also, in Belgium nearly all familial ALS cases have been explained by known ALS genes, while in The Netherlands we know the causative gene defect in approximately 65% of familial cases (Figure 2).

By contrast, the genetic causes of sporadic ALS remain far more elusive. Mutations in some of the familial ALS genes have also been found in a small proportion of the sporadic ALS patients. However, these can explain only 6-15% of the sporadic ALS cases (Figure 2). Earlier genetic studies in ALS have mainly used a candidate gene approach, based on proposed pathogenic pathways or known interactions between a gene and environmental factors [33-36]. Examples of genes identified by such candidate gene approaches are SMN, HFE, PON1, VCP, and VEGF. However, not all of these associations have been successfully replicated [13].

Figure 1 Timeline of gene discoveries in familial and sporadic ALS
fALS = familial ALS; sALS = sporadic ALS

Figure 2 Proportions of mutations identified in familial and sporadic ALS
Proportions are based on Caucasian study populations. fALS = familial ALS; sALS = sporadic ALS

With the development of new genotyping techniques for faster and cheaper assessment of genetic variation across the genome, the first genome-wide association studies in sporadic ALS emerged. With chip-based genotyping arrays, hundreds of thousands of single nucleotide polymorphisms (SNPs) could be genotyped in a single experiment. This allowed for the testing of associations between a large number of common variants (with a minor allele frequency (MAF) in the general population of 1-5% and higher) in so-called genome-wide association studies (GWAS). Following the common disease-common variant hypothesis, these GWASs test for associations with genetic variants that have low penetrance, but are relatively frequent in the population [37]. These common variants, on the other hand, might tag haplotypes containing much rarer variants with a large effect on disease susceptibility. One great advantage of a GWAS is the hypothesis-free assessment of the genome, thus greatly expanding the scope of possible pathogenic candidates. Such a great number of association tests, however, is penalized by the need for multiple-testing correction. Therefore, in order to identify associations with small effect sizes, typically with odds ratios of less than 2.0, GWASs require large sample sizes to achieve sufficient statistical power.

In 2007, the first genome-wide association studies in ALS were published, one of which identified ITPR2 as a susceptibility gene, although this association has not been replicated in later studies [38-40]. Subsequent GWASs in sporadic ALS have implicated several other susceptibility loci, including DPP6, FGGY, UNC13A, and 9p21.2 [41-43]. Other GWASs have implicated disease-modifying loci in ALS, such as KIFAP3 (associated with survival) and chromosome 1p34.1 (associated with age at onset) [44, 45]. However, replication has proven difficult and only the association with the chromosome 9p21.2 locus has been consistently replicated in independent cohorts [46-51]. Later, in 2011, C9orf72 was discovered to be the causative gene within the chromosome 9p21.2 locus, bridging quite conclusively the gap between familial and sporadic ALS [52, 53].

The identification of a hexanucleotide repeat expansion in C9orf72 as the cause for chromosome 9p-linked ALS and FTD has been a major breakthrough. The chromosome 9p21.2 locus was first identified in linkage studies in families with both ALS and FTD [54-56]. Subsequently, GWASs have been able to fine-map the locus to three genes, of which C9orf72 was ultimately discovered to harbor the causal variant. This discovery forms an important genetic link between ALS with pure motor neuron symptoms and cognitive symptoms in FTD. The function of C9orf72, to date, is unclear.
Recent reports have suggested that the function of the protein may be related to DENN-like proteins, which are involved in vesicular trafficking [57, 58]. The repeat expansion in C9orf72 may either lead to a detrimental loss-of-function or to a toxic gain-of-function. Also, through a mechanism called repeat-associated non-ATG (RAN) translation, different polypeptides are produced, forming neuronal inclusions that have been identified throughout the central nervous system in C9orf72-related ALS and FTD patients [59].

Frontotemporal dementia is a cortical-type dementia characterized by changes in cognition, behavior and language, in contrast to Alzheimer’s disease, in which loss of memory function forms the hallmark. Brain imaging studies typically show frontal and temporal lobe atrophy. Population incidence rates are comparable to those in ALS (approximately 3 in 100,000 person-years). Approximately 6% of sporadic ALS and FTD patients carry the expanded C9orf72 repeat, while in familial ALS and FTD 37% and 25% of cases have the repeat expansion, respectively.

In summary, although genetic factors appear to play a considerable role in the etiology of ALS, a large part of the estimated heritability has not yet been accounted for. Sporadic ALS is considered a trait of complex etiology, in which environmental factors may interplay with genetic risk factors and, together, reach a ‘liability threshold’ that triggers motor neuron degeneration. Genetic studies in ALS have evolved from candidate gene approaches to hypothesis-free genome-wide association studies. Ultimately, one of the most important genetic causes of ALS, repeat expansions in C9orf72, has demonstrated that genetic links exist to other neurological disorders, for example FTD.
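
As a rough illustration of the multiple-testing penalty described in this introduction for genome-wide association studies, the sketch below computes a Bonferroni-corrected significance threshold. The function names and the example of one million independent tests are assumptions made for illustration; they are not taken from the studies cited in this chapter.

```python
"""Illustrative sketch: Bonferroni threshold for a genome-wide scan."""


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Assumed example: one million independent common-variant tests.
    n_tests = 1_000_000
    print(bonferroni_threshold(0.05, n_tests))
    # Without correction, 100 tests at alpha = 0.05 already make a false positive very likely.
    print(familywise_error_rate(0.05, 100))
```

With one million independent tests, this yields the commonly used genome-wide significance threshold of 5 × 10^-8, which is why associations with modest odds ratios need large sample sizes to be detected.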

AIMS OF THIS THESIS

This thesis aims:
- to identify genetic susceptibility factors for ALS in candidate genes or by using a genome-wide mapping of gene expression;
- to identify genetic susceptibility factors for sporadic ALS that are shared with other neurological disorders in order to elucidate pathogenic pathways underlying neurodegeneration;
- to identify genetic disease modifiers for ALS that may provide insight into pathogenic mechanisms or provide possible therapeutic targets to change the onset or course of ALS.

In Part I, genetic susceptibility factors are investigated in a candidate gene approach by exploring a gene-environment interaction for the paraoxonase 1 gene (PON1) in Chapter 2 and by investigating ANG mutations in familial ALS cases (Chapter 3). Chapter 4 describes the integration of genome-wide expression profiles and genome-wide SNP genotypes in order to identify additional susceptibility loci not identified in previous GWASs.

Part II focuses on a possible overlap in genetic risk factors between ALS and other neurological disorders (genetic pleiotropy). In Chapter 5, shared risk factors for ALS and multiple sclerosis (MS) are explored, while in Chapter 6 we performed a genome-wide meta-analysis to identify shared risk loci for ALS and FTD.

Finally, Part III aims at the identification of genetic disease modifiers. As a follow-up on the results of a previous GWAS, we investigated whether UNC13A might modify survival in ALS patients (Chapter 7). In Chapter 8, we collected genome-wide data from ALS and FTD patients with C9orf72 repeat expansions and investigated possible ‘genetic switches’ that may determine the disease phenotype.


REFERENCES

1. Charcot JM, Joffroy A. Deux cas d’atrophie musculaire progressive: avec lésions de la substance grise et des faisceaux antérolatéraux de la moelle épinière. Archives de physiologie normale et pathologique 1869;2:744-760.
2. Logroscino G, Traynor BJ, Hardiman O, Chiò A, Mitchell D, et al. Incidence of amyotrophic lateral sclerosis in Europe. J Neurol Neurosurg Psychiatr 2010;81:385-390.
3. Huisman MHB, de Jong SW, van Doormaal PTC, Weinreich SS, Schelhaas HJ, et al. Population based epidemiology of amyotrophic lateral sclerosis using capture-recapture methodology. J Neurol Neurosurg Psychiatr 2011;82:1165-1170.
4. Mehta P, Antao V, Kaye W, Sanchez M, Williamson D, et al. Prevalence of amyotrophic lateral sclerosis - United States, 2010-2011. MMWR Surveill Summ 2014;63 Suppl 7:1-14.
5. Abhinav K, Stanton B, Johnston C, Hardstaff J, Orrell RW, et al. Amyotrophic lateral sclerosis in South-East England: a population-based study. The South-East England register for amyotrophic lateral sclerosis (SEALS Registry). Neuroepidemiology 2007;29:44-48.
6. McGuire V, Longstreth WT, Nelson LM, Koepsell TD, Checkoway H, et al. Occupational exposures and amyotrophic lateral sclerosis. A population-based case-control study. Am J Epidemiol 1997;145:1076-1088.
7. del Aguila MA, Longstreth WT, McGuire V, Koepsell TD, van Belle G. Prognosis in amyotrophic lateral sclerosis: a population-based study. Neurology 2003;60:813-819.
8. Kiernan MC, Vucic S, Cheah BC, Turner MR, Eisen A, et al. Amyotrophic lateral sclerosis. Lancet 2011;377:942-955.
9. Hardiman O, van den Berg LH, Kiernan MC. Clinical diagnosis and management of amyotrophic lateral sclerosis. Nat Rev Neurol 2011;7:639-649.
10. Chiò A, Logroscino G, Hardiman O, Swingler R, Mitchell D, et al. Prognostic factors in ALS: A critical review. Amyotroph Lateral Scler 2009;10:310-323.
11. Miller RG, Mitchell JD, Moore DH. Riluzole for amyotrophic lateral sclerosis (ALS)/motor neuron disease (MND). Cochrane Database Syst Rev 2012;3:CD001447.
12. Byrne S, Walsh C, Lynch C, Bede P, Elamin M, et al. Rate of familial amyotrophic lateral sclerosis: a systematic review and meta-analysis. J Neurol Neurosurg Psychiatr 2011;82:623-627.
13. Andersen PM, Al-Chalabi A. Clinical genetics of amyotrophic lateral sclerosis: what do we really know? Nat Rev Neurol 2011;7:603-615.
14. Chiò A, Battistini S, Calvo A, Caponnetto C, Conforti FL, et al. Genetic counselling in ALS: facts, uncertainties and clinical suggestions. J Neurol Neurosurg Psychiatr 2014;85:478-485.
15. Sutedja NA, Veldink JH, Fischer K, Kromhout H, Wokke JHJ, et al. Lifetime occupation, education, smoking, and risk of ALS. Neurology 2007;69:1508-1514.
16. Armon C. Smoking may be considered an established risk factor for sporadic ALS. Neurology 2009;73:1693-1698.
17. de Jong SW, Huisman MHB, Sutedja NA, Van Der Kooi AJ, de Visser M, et al. Smoking, alcohol consumption, and the risk of amyotrophic lateral sclerosis: a population-based study. Am J Epidemiol 2012;176:233-239.
18. Chiò A, Benzi G, Dossena M, Mutani R, Mora G. Severely increased risk of amyotrophic lateral sclerosis among Italian professional football players. Brain 2005;128:472-476.
19. Huisman MHB, Seelen M, de Jong SW, Dorresteijn KRIS, van Doormaal PTC, et al. Lifetime physical activity and the risk of amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatr 2013;84:976-981.
20. Weisskopf MG, O’Reilly EJ, McCullough ML, Calle EE, Thun MJ, et al. Prospective study of military service and mortality from ALS. Neurology 2005;64:32-37.
21. Haley RW. Excess incidence of ALS in young Gulf War veterans. Neurology 2003;61:750-756.
22. Park RM, Schulte PA, Bowman JD, Walker JT, Bondy SC, et al. Potential occupational risks for neurodegenerative diseases. Am J Ind Med 2005;48:63-77.
23. Ingre C, Roos PM, Piehl F, Kamel F, Fang F. Risk factors for amyotrophic lateral sclerosis. Clin Epidemiol 2015;7:181-193.
24. Veldink JH, Kalmijn S, Groeneveld GJ, Titulaer MJ, Wokke JHJ, et al. Physical activity and the association with sporadic ALS. Neurology 2005;64:241-245.
25. Hamidou B, Couratier P, Besançon C, Nicol M, Preux PM, et al. Epidemiological evidence that physical activity is not a risk factor for ALS. Eur J Epidemiol 2014;29:459-475.
26. Al-Chalabi A, Fang F, Hanby MF, Leigh PN, Shaw CE, et al. An estimate of amyotrophic lateral sclerosis heritability using twin data. J Neurol Neurosurg Psychiatr 2010;81:1324-1326.
27. Graham AJ, Macdonald AM, Hawkes CH. British motor neuron disease twin study. J Neurol Neurosurg Psychiatr 1997;62:562-569.
28. Fogh I, Ratti A, Gellera C, Lin K, Tiloca C, et al. A genome-wide association meta-analysis identifies a novel locus at 17q11.2 associated with sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2014;23:2220-2231.
29. Keller MF, Ferrucci L, Singleton AB, Tienari PJ, Laaksovirta H, et al. Genome-wide analysis of the heritability of amyotrophic lateral sclerosis. JAMA Neurol 2014;71:1123-1134.
30. McLaughlin RL, Vajda A, Hardiman O. Heritability of Amyotrophic Lateral Sclerosis: Insights From Disparate Numbers. JAMA Neurol 2015;72:857-858.
31. Rosen DR, Siddique T, Patterson D, Figlewicz DA, Sapp P, et al. Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature 1993;362:59-62.
32. van Es MA, Dahlberg C, Birve A, Veldink JH, van den Berg LH, et al. Large-scale SOD1 mutation screening provides evidence for genetic heterogeneity in amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatr 2010;81:562-566.
33. Kasperaviciute D, Weale ME, Shianna KV, Banks GT, Simpson CL, et al. Large-scale pathways-based association study in amyotrophic lateral sclerosis. Brain 2007;130:2292-2301.
34. Slowik A, Tomik B, Wolkow PP, Partyka D, Turaj W, et al. Paraoxonase gene polymorphisms and sporadic ALS. Neurology 2006;67:766-770.
35. Cronin S, Greenway MJ, Prehn JHM, Hardiman O. Paraoxonase promoter and intronic variants modify risk of sporadic amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatr 2007;78:984-986.
36. Saeed M, Siddique N, Hung WY, Usacheva E, Liu E, et al. Paraoxonase cluster polymorphisms are associated with sporadic ALS. Neurology 2006;67:771-776.
37. Gibson G. Rare and common variants: twenty arguments. Nat Rev Genet 2011;13:135-145.
38. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877.
39. Fernández-Santiago R, Sharma M, Berg D, Illig T, Anneser J, et al. No evidence of association of FLJ10986 and ITPR2 with ALS in a large German cohort. Neurobiol Aging 2011;32:551.e551-554.
40. Schymick JC, Scholz SW, Fung H-C, Britton A, Arepalli S, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 2007;6:322-328.
41. van Es MA, van Vught PWJ, Blauw HM, Franke L, Saris CGJ, et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet 2008;40:29-31.
42. Dunckley T, Huentelman MJ, Craig DW, Pearson JV, Szelinger S, et al. Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med 2007;357:775-788.
43. van Es MA, Veldink JH, Saris CGJ, Blauw HM, van Vught PWJ, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
44. Landers JE, Melki J, Meininger V, Glass JD, van den Berg LH, et al. Reduced expression of the Kinesin-Associated Protein 3 (KIFAP3) gene increases survival in sporadic amyotrophic lateral sclerosis. Proc Natl Acad Sci USA 2009;106:9004-9009.
45. Ahmeti KB, Ajroud-Driss S, Al-Chalabi A, Andersen PM, Armstrong J, et al. Age of onset of amyotrophic lateral sclerosis is modulated by a locus on 1p34.1. Neurobiol Aging 2013;34:357.e357-319.
46. van Es MA, van Vught PWJ, Veldink JH, Andersen PM, Birve A, et al. Analysis of FGGY as a risk factor for sporadic amyotrophic lateral sclerosis. Amyotroph Lateral Scler 2009;10:441-447.
47. Chiò A, Schymick JC, Restagno G, Scholz SW, Lombardo F, et al. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2009;18:1524-1532.
48. Cronin S, Tomik B, Bradley DG, Slowik A, Hardiman O. Screening for replication of genome-wide SNP associations in sporadic ALS. Eur J Hum Genet 2009;17:213-218.
49. Laaksovirta H, Peuralinna T, Schymick JC, Scholz SW, Lai SL, et al. Chromosome 9p21 in amyotrophic lateral sclerosis in Finland: a genome-wide association study. Lancet Neurol 2010;9:978-985.
50. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
51. Traynor BJ, Nalls M, Lai S-L, Gibbs RJ, Schymick JC, et al. Kinesin-associated protein 3 (KIFAP3) has no effect on survival in a population-based cohort of ALS patients. Proc Natl Acad Sci USA 2010;107:12335-12338.
52. Renton AE, Majounie E, Waite A, Simón-Sánchez J, Rollinson S, et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011;72:257-268.
53. Dejesus-Hernandez M, Mackenzie IR, Boeve BF, Boxer AL, Baker M, et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 2011;72:245-256.
54. Gijselinck I, Engelborghs S, Maes G, Cuijt I, Peeters K, et al. Identification of 2 Loci at chromosomes 9 and 14 in a multiplex family with frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Arch Neurol 2010;67:606-616.
55. Le Ber I, Camuzat A, Berger E, Hannequin D, Laquerrière A, et al. Chromosome 9p-linked families with frontotemporal dementia associated with motor neuron disease. Neurology 2009;72:1669-1676.


PART I Genetic susceptibility factors for ALS

2

INTERACTION BETWEEN PON1 AND POPULATION DENSITY IN AMYOTROPHIC LATERAL SCLEROSIS

NEUROREPORT. 2009;20(2):186-90

Frank P Diekstra, Ana Beleza-Meireles, Christopher E Shaw, P Nigel Leigh, Ammar Al-Chalabi

ABSTRACT

Paraoxonase polymorphisms have been associated with amyotrophic lateral sclerosis (ALS). Paraoxonases are detoxifying enzymes involved in the metabolism of organophosphates. We tested the hypothesis that genetic variation within paraoxonase genes would interact with the environmental exposure to paraoxonase substrates. We used population density in the location of residence of ALS patients as a surrogate marker for environmental exposure. Paraoxonase genotypes at previously associated single nucleotide polymorphisms rs662, rs854560, rs6954345 and rs11981433 were studied in 98 patients from the South East England ALS population-based register. A case-only analysis was carried out and median population density was used to categorize patients into rural or urban environments. We found a significant interaction with population density for marker rs854560 (L55M) in ALS.

INTRODUCTION

Amyotrophic lateral sclerosis (ALS) is thought to be a disease of complex etiology in which both genetic and environmental factors contribute to pathogenesis. Numerous epidemiological studies have investigated environmental risk factors for ALS. A greater incidence of ALS has been reported in individuals with a history of intensive physical activity [1], professional football [2], cigarette smoking [3], and exposure to heavy metals such as lead [4], although these remain controversial. In addition, pesticide and herbicide exposure has been implicated in ALS; for example, an increased risk has been reported in rural residents or agricultural workers who have been exposed to these substances [5, 6]. An excess incidence of ALS (or an ALS-like syndrome) has been found in Gulf War veterans, and it has been hypothesized that exposure to pesticides used in their tents or nerve gases may have contributed [7].

Considering the oxidative-stress hypothesis in sporadic ALS and the previously described environmental factors, the paraoxonases are interesting candidate genes. The paraoxonase family consists of three adjacently located genes named PON1, PON2, and PON3 on chromosome 7q21.3, coding for esterase enzymes [8]. All three enzymes possess anti-oxidative properties [9].

Four previous studies have investigated polymorphisms within the paraoxonase gene family in ALS [8, 10-12]. Each study found a different paraoxonase variant to be associated. So far, the PON1 promoter single nucleotide polymorphism (SNP) rs705379 (−108c>t) [11], the PON1 coding SNPs rs662 (576a>g, Q192R) [8] and rs854560 (163t>a, L55M) [10], the PON2 coding SNP rs6954345 (926c>g, C311S) [8] and a haplotype covering PON2 and PON3 [12] have been associated with ALS. Although ALS susceptibility because of paraoxonase variants is very likely to be linked to environmental factors, only one study investigated a gene-environment interaction [11]. The study found a significant interaction between self-reported pesticide exposure and paraoxonase polymorphisms for three PON1 promoter SNPs, including −108c>t, in an Australian population. However, only allelic tests reached significance and the findings did not extend to the genotype or haplotype level. No gene-environment interaction was found for the L55M polymorphism [11].

Variations in paraoxonase 1 can modify the rate of detoxification of pesticides [13, 14]. We hypothesized that ALS patients living near agricultural fields would be more exposed to pesticides than patients in an urban area and that paraoxonase polymorphisms might modify their susceptibility to pesticide toxicity. We therefore investigated a gene-environment interaction between paraoxonase gene polymorphisms and population density in ALS patients for four previously associated SNPs in a population-based case-only study design.
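
In a case-only design, the association between genotype and exposure among patients is used as an estimate of the gene-environment interaction, on the assumption that genotype and exposure are independent in the source population. A minimal sketch of that calculation follows, using the PON1 L55M allele counts for rural and urban patients reported in Table 2 of this chapter. The function name and the Woolf (log-based) confidence interval are illustrative choices, not the analysis performed in the study.

```python
import math


def case_only_odds_ratio(exp_var, exp_ref, unexp_var, unexp_ref, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 case-only table.

    exp_var / exp_ref: variant and reference allele counts in exposed cases.
    unexp_var / unexp_ref: the same counts in unexposed cases.
    """
    counts = (exp_var, exp_ref, unexp_var, unexp_ref)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (exp_var * unexp_ref) / (exp_ref * unexp_var)
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Table 2, PON1 L55M allelic counts (L/M): rural 55/43, urban 74/24.
# Treating rural residence as the "exposed" category is an assumption of this sketch.
or_, lo, hi = case_only_odds_ratio(exp_var=43, exp_ref=55, unexp_var=24, unexp_ref=74)
print(f"case-only OR for the M allele, rural vs urban: {or_:.2f} ({lo:.2f}-{hi:.2f})")
```

An odds ratio above 1 with an interval excluding 1 would point to the M allele being over-represented among rural patients, which is the pattern the allelic test in Table 2 evaluates with Fisher's exact test.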

METHODS

PATIENTS

Patients for this study were selected from the South East England ALS registry [15]. The catchment area for this population-based study consists of 26 postcode regions including seven South East London boroughs and nineteen local authorities in the counties of Brighton and Hove, East Sussex and Kent. The registry identified 471 patients with ALS in the period from 1 January 2002 to 30 June 2006 [15]. This study was approved by the Institutional Research Ethics Committee (222/02). Data on smoking and exercise were not available for covariate analysis.

GENOTYPES

Genotypes for SNPs rs662, rs854560, rs6954345 and rs11981433 were determined using a 1536-plex GoldenGate assay on an Illumina BeadArray station as described earlier [16].

POPULATION DENSITY

Population density data for each of the 26 postcode regions in 2002 were obtained from the UK Office for National Statistics [17].

STATISTICAL ANALYSIS
As genotype data were available for a selected number of cases within the South East England ALS registry, patient characteristics of the current study population were compared to the full South East England ALS registry population. Dichotomous variables were tested by using Pearson’s χ² test; normally distributed variables were tested with the independent samples t-test; and the Mann-Whitney test was used for continuous variables with a non-normal distribution. SNPs were tested for deviation from Hardy-Weinberg equilibrium by using an exact test18 in the program PLINK v1.01 (Shaun Purcell, http://pngu.mgh.harvard.edu/purcell/plink).

To test for gene-environment interaction, population density data were dichotomized. The median (1272 people/km²) was used as the cut-off point, as this value had the highest discriminating value when tested in a receiver-operator curve for each SNP. Areas with high population densities were referred to as urban; low population density areas were designated as rural. Interaction with SNPs was tested with Fisher’s exact test at both the genotypic and allelic levels as well as with the Cochran-Armitage trend test. Association analyses were carried out by using the software packages SPSS v15.0 (SPSS Inc) and PLINK. For haplotypic association tests, pairwise linkage disequilibrium values for the selected SNPs were determined and haplotypes estimated with the program Haploview v4.0 (Broad Institute, http://www.broad.mit.edu/mpg/haploview) according to the default confidence interval method.19

Kaplan-Meier survival curves were estimated for the SNP with the strongest gene-environment association. A log-rank test was used to compare the survival curves.

Table 1 Patient characteristics

                                          South East England
                                          ALS registry         Study population
                                          n = 471              n = 98             p
Age of onset, mean (y)                    59.3                 61.3               0.160 *
Gender, female                            190 (40.3 %)         46 (46.9 %)        0.228 †
Site of onset, bulbar                     143 (30.4 %)         36 (36.7 %)        0.216 †
Survival, median (y)                      2.8                  2.7                0.655 ‡
Family history, none                      397 (84.3 %)         88 (89.8 %)        0.250 †
Population density, median (people/km²)   1977                 1272               0.900 ‡

*, independent samples t-test; †, 1df Pearson χ² test; ‡, Mann-Whitney test

Table 2 Gene-environment interaction statistics of the genotyped polymorphisms

Test                model              rural (n=49)   urban (n=49)   p         p corrected†

PON1 Q192R (rs662)
Genotypic *         QQ vs QR vs RR     30/17/2        19/22/8        0.043     0.17
Cochran-Armitage    Q vs R (trend)     77/21          60/38          0.0099    0.04
Allelic             Q vs R             77/21          60/38          0.012
Dominant            QQ vs QR+RR        30/19          19/30          0.043
Recessive           QQ+QR vs RR        47/2           41/8           0.091

PON1 L55M (rs854560)
Genotypic *         LL vs LM vs MM     18/19/12       25/24/0        0.00048   0.0019
Cochran-Armitage    L vs M (trend)     55/43          74/24          0.0047    0.019
Allelic             L vs M             55/43          74/24          0.0065
Dominant            LL vs LM+MM        18/31          25/24          0.22
Recessive           LL+LM vs MM        37/12          49/0           0.00023

PON2 C311S (rs6954345)
Genotypic *         CC vs CS vs SS     28/17/4        26/21/2        0.55      1
Cochran-Armitage    C vs S (trend)     73/25          73/25          1         1

PON2 g10045a>g (rs11981433)
Genotypic *         aa vs ag vs gg     16/23/10       17/20/12       0.87      1
Cochran-Armitage    a vs g (trend)     55/43          54/44          0.89      1

*, genotypic Fisher's exact test; †, Bonferroni correction for four markers.
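The trend test and Bonferroni correction reported in Table 2 can be re-derived from the genotype counts. The sketch below is a minimal illustration, not the PLINK implementation used in the study: it assumes additive weights (0, 1, 2) and the asymptotic chi-square form of the Cochran-Armitage statistic, and the function names are our own.

```python
import math

def cochran_armitage(rural, urban, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 table of genotype counts.

    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    r1, r2 = sum(rural), sum(urban)
    n = r1 + r2
    cols = [a + b for a, b in zip(rural, urban)]
    # Trend statistic: weighted difference between the two groups.
    t = sum(w * (a * r2 - b * r1) for w, a, b in zip(weights, rural, urban))
    # Variance of the statistic under the null hypothesis of no trend.
    s1 = sum(w * w * c * (n - c) for w, c in zip(weights, cols))
    s2 = sum(
        weights[i] * weights[j] * cols[i] * cols[j]
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )
    var = (r1 * r2 / n) * (s1 - 2 * s2)
    chi2 = t * t / var
    # Upper tail of a chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p * n_tests)

# Genotype counts (rural, urban) taken from Table 2.
table2 = {
    "L55M (rs854560)": ((18, 19, 12), (25, 24, 0)),
    "Q192R (rs662)": ((30, 17, 2), (19, 22, 8)),
}

results = {}
for snp, (rural, urban) in table2.items():
    chi2, p = cochran_armitage(rural, urban)
    results[snp] = (chi2, p, bonferroni(p, 4))  # four markers were tested
    print(f"{snp}: chi2={chi2:.2f}, p={p:.4f}, corrected p={results[snp][2]:.3f}")
```

Under these assumptions the trend p-values agree with those in Table 2 to the reported precision, which suggests the tabulated values are asymptotic rather than permutation-based.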

RESULTS

PATIENTS
Ninety-eight patients were studied. Patient characteristics of the current study population were not significantly different from the non-genotyped South East England ALS registry population (Table 1).

GENOTYPES
The genotyping call rate was 100% for each of the four selected SNPs. All SNPs were in Hardy-Weinberg equilibrium.
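As an illustration of the Hardy-Weinberg check, the sketch below applies an exact test to the pooled L55M genotype counts from Table 2 (rural plus urban: 43 LL, 43 LM, 12 MM). It is a plain re-implementation of the conditional probability of the heterozygote count given the allele counts, written for clarity rather than speed; it is not PLINK's code, and the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test.

    Sums the probabilities of all heterozygote counts that are no more
    likely than the observed one, conditional on the allele counts.
    Returns (p-value, total probability mass as a sanity check).
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab      # copies of allele a
    n_b = 2 * n - n_a          # copies of allele b

    def prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        num = factorial(n) * 2 ** het * factorial(n_a) * factorial(n_b)
        den = factorial(hom_a) * factorial(het) * factorial(hom_b) * factorial(2 * n)
        return Fraction(num, den)  # exact arithmetic avoids float overflow

    # The heterozygote count must have the same parity as the allele count.
    probs = {h: prob(h) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = probs[n_ab]
    p_value = sum(p for p in probs.values() if p <= observed)
    return float(p_value), float(sum(probs.values()))

# Pooled L55M genotype counts from Table 2: LL, LM, MM.
p_value, total = hwe_exact_p(43, 43, 12)
print(f"HWE exact p = {p_value:.2f}")
```

A large p-value here is consistent with the statement that the SNP did not deviate from Hardy-Weinberg equilibrium in the full sample.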

Table 3 Estimated haplotype frequencies and association results

Haplotype        Frequency,   Frequency,   Overall
Q192R – L55M     rural        urban        frequency, %   p
Q – L            35.9         37.0         36.5           0.87
Q – M            42.7         24.2         33.4           0.0062
R – L            20.2         38.5         29.4           0.0049

The R-M haplotype did not occur in our study population.

GENE-ENVIRONMENT INTERACTION
There was a significant association with population density for L55M (rs854560) at the genotypic level (genotypic p=0.00048, Cochran-Armitage p=0.0047). This association withstood Bonferroni correction for four markers (Table 2). We tested other models of association for this SNP and found that the most significant association was for the recessive model (p=0.00023, Table 2). None of the patients homozygous for the M allele lived in an urban environment, so the odds ratio cannot be calculated. There was a weaker signal from Q192R (rs662) (genotypic p=0.043, Cochran-Armitage p=0.0099, Table 2). These two PON1 SNPs were in strong linkage disequilibrium (D’=0.93). The D’ value for the PON2 SNPs rs6954345 and rs11981433 was 0.87. Two haploblocks were formed: one for the PON1 SNPs and another for the PON2 SNPs. The strongest haplotype association was for the Q192R-L55M R-L haplotype (p=0.0049), which was underrepresented in rural residents (Table 3). Overall, haplotype analysis did not improve on the single-marker association.
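The recessive-model result for L55M can be checked directly from the counts in Table 2 (rural: 37 LL+LM and 12 MM; urban: 49 LL+LM and 0 MM). The following sketch implements a two-sided Fisher's exact test with the standard library only; the helper name is our own, and the zero cell shows why no odds ratio can be given.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables that are at most as likely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: rural, urban. Columns: LL+LM, MM (recessive model, Table 2).
rural = (37, 12)
urban = (49, 0)
p_recessive = fisher_exact_two_sided(*rural, *urban)
print(f"recessive model p = {p_recessive:.5f}")

# Odds of MM in rural versus urban residents: the urban MM count is zero,
# so the ratio (12 * 49) / (37 * 0) has no finite value.
denominator = rural[0] * urban[1]
odds_ratio = (rural[1] * urban[0]) / denominator if denominator else None
print("odds ratio:", odds_ratio)
```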

SURVIVAL
In the rural environment, when comparing the three L55M genotypes (LL vs LM vs MM), survival significantly decreased with an increasing number of M alleles (log rank p=0.025, Figure 1a). By contrast, in the urban environment, there was no significant difference in survival between the L and M alleles (log rank p=0.56, Figure 1b).

Figure 1 Kaplan-Meier survival curves for L55M (rs854560).

In those with rural residence, LL genotype was associated with longer survival, whereas in those with urban residence there was no difference in survival. The urban group contained no one with MM genotype.
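For readers unfamiliar with the survival estimates behind Figure 1, the sketch below shows how a Kaplan-Meier curve is built from follow-up times and censoring indicators. The input values are hypothetical and chosen only to make the arithmetic easy to follow; they are not patient data from this study, and the log-rank comparison itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve

# Hypothetical follow-up (years) for four patients; the second is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)  # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

The censored patient contributes to the risk set up to year 2 but does not cause a step in the curve, which is why the estimate drops from 0.75 to 0.375 at year 3.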

DISCUSSION
This study investigated gene-environment interactions between paraoxonase gene polymorphisms and population density in ALS patients, using population density as a proxy for rural versus urban regions and therefore presumed differential exposures. We found a significant association for SNP L55M (rs854560). There was a higher M allele frequency in ALS patients with rural residence and, strikingly, there were no individuals with the MM genotype in the urban group. Also, in rural areas, the M allele was associated with shorter survival.

The L55M polymorphism is one of the two common coding variants in PON1, the other being Q192R. Paraoxonase 1 hydrolase activity is modified by the Q192R polymorphism, with different activity levels for different substrates. The Q variant favors hydrolysis of diazoxon, soman and sarin, while the R isoform is more effective at hydrolyzing paraoxon.14 The L55M polymorphism also appears to be substrate specific; for example, the 55LM and 55MM genotypes confer greater protection from lipid peroxidation in low and high density lipoproteins than the 55LL genotype.20 L55M has been reported to affect paraoxonase 1 activity by modifying plasma expression levels. Lower plasma mRNA, paraoxonase 1 concentrations and diazoxonase activities were found in individuals carrying the M variant compared to those with the L variant.21,22 It is possible that the decreased paraoxonase 1 expression levels are not due to the L55M amino-acid substitution, but rather due to linkage disequilibrium with the PON1 promoter polymorphism −108c>t (rs705379).23 In the present study we were not able to test for linkage disequilibrium between the −108c>t promoter SNP and L55M because −108c>t had not been typed.


Overall, the L55M genotype frequencies in our sample were in Hardy-Weinberg equilibrium. However, when stratified by rural or urban residence, a relative increase of MM homozygosity was found in the rural group, while none of the ALS patients in an urban environment carried the MM genotype. This finding suggests that in the rural environment the MM genotype predisposes to the development of ALS.

A weakness of this study is the use of population density data as a proxy for increased pesticide exposure within the rural environment. There are many substances that may differ between low and high population density environments and it is therefore difficult to point out a single factor to explain our findings. However, this study indicates that a functional variant mapped by the M allele of rs854560 may affect the risk of ALS in the context of an environmental exposure.

A further weakness of this study is that a value for population density was assigned to each patient based on postcode data collected at time of diagnosis. This might not account for the total duration a patient lived in a certain area. It would, however, be reasonable to expect that this is a valid approximation for the environmental exposure in the immediate years around the onset of symptoms that might account for disease triggering.

As noted above, the current study population was derived from a population-based study of ALS patients in the South East of England. Population-based studies can minimize selection bias and maximize generalizability of findings.24 We realize that the individuals included in this study formed a selection from the full South East England ALS registry. However, since major patient characteristics did not differ significantly from the original population, we consider the current selection a representative part of the whole South East England ALS registry.

A case-only design was used to assess gene-environment interaction. This approach can achieve greater precision in estimating interactions than the more conventional case-control method.24,25 It is therefore the most suitable approach when dealing with small study populations. However, case-only studies rely on two assumptions. The genotype must be independent from the environmental exposure and the disease should be rare so that the probability of undetected disease amongst “controls” is low.24 There is, of course, no reason to believe that variations in paraoxonase genes (responsible for pesticide and low density lipoprotein metabolism) would determine someone’s area of residence, but this possibility cannot be excluded. Furthermore, the crude incidence of ALS within the South East England ALS registry was estimated to be 1.06 per 100,000 person-years;15 therefore, the probability of hidden cases amongst the control population is low. A drawback of the case-only design is that although the interaction of gene and environment can be measured, the effects of gene or environment separately cannot be determined.
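To make the case-only logic concrete: if genotype and exposure are independent in the source population, the odds ratio relating genotype to exposure among cases alone estimates the multiplicative gene-environment interaction. The sketch below computes that odds ratio with a Woolf (log-based) 95% confidence interval for the Q192R dominant model in Table 2. The choice of this model and the helper name are ours, for illustration only; the study itself reported Fisher's exact p-values rather than odds ratios.

```python
import math

def case_only_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2 x 2 table of cases.

    [[a, b], [c, d]] = [[rural non-carrier, rural carrier],
                        [urban non-carrier, urban carrier]]
    Requires all four cells to be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Q192R dominant model (Table 2): rural QQ/QR+RR = 30/19, urban = 19/30.
or_q192r, ci_low, ci_high = case_only_odds_ratio(30, 19, 19, 30)
print(f"case-only OR = {or_q192r:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The same calculation fails for the L55M recessive model because of the empty urban MM cell, and, as noted above, it says nothing about the separate main effects of genotype or residence.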

CONCLUSION
In this study we have found evidence for a gene-environment interaction of the L55M paraoxonase variant in ALS with population density. This is consistent with our prior hypothesis of an interaction with different pollutants in urban and rural environments but requires replication in studies using a larger population and formal measures of pesticide exposure. The measurement of additional parameters such as serum paraoxonase 1 activity would provide useful information as to the possible causes of the reported gene-environment interaction.

REFERENCES
1. Scarmeas N, Shih T, Stern Y, Ottman R, Rowland LP. Premorbid weight, body mass, and varsity athletics in ALS. Neurology 2002;59:773-775.
2. Chiò A, Benzi G, Dossena M, Mutani R, Mora G. Severely increased risk of amyotrophic lateral sclerosis among Italian professional football players. Brain 2005;128:472-476.
3. Nelson LM, McGuire V, Longstreth WT, Matkin C. Population-based case-control study of amyotrophic lateral sclerosis in western Washington State. Cigarette smoking and alcohol consumption. Am J Epidemiol 2000;151:156-163.
4. Kamel F, Umbach DM, Munsat TL, Shefner JM, Hu H, Sandler DP. Lead exposure and amyotrophic lateral sclerosis. Epidemiology 2002;13:311-319.
5. Holloway SM, Emery AE. The epidemiology of motor neuron disease in Scotland. Muscle Nerve 1982;5:131-133.
6. McGuire V, Longstreth WT, Nelson LM, Koepsell TD, Checkoway H, Morgan MS et al. Occupational exposures and amyotrophic lateral sclerosis. A population-based case-control study. Am J Epidemiol 1997;145:1076-1088.
7. Haley RW. Excess incidence of ALS in young Gulf War veterans. Neurology 2003;61:750-756.
8. Slowik A, Tomik B, Wolkow PP, Partyka D, Turaj W, Malecki MT et al. Paraoxonase gene polymorphisms and sporadic ALS. Neurology 2006;67:766-770.
9. Mackness B, Durrington P, Mackness M. The paraoxonase gene family and coronary heart disease. Curr Opin Lipidol 2002;13:357-362.
10. Cronin S, Greenway MJ, Prehn JH, Hardiman O. Paraoxonase promoter and intronic variants modify risk of sporadic amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatr 2007;78:984-986.
11. Morahan JM, Yu B, Trent RJ, Pamphlett R. A gene-environment study of the paraoxonase 1 gene and pesticides in amyotrophic lateral sclerosis. Neurotoxicology 2007;28:532-540.
12. Saeed M, Siddique N, Hung WY, Usacheva E, Liu E, Sufit RL et al. Paraoxonase cluster polymorphisms are associated with sporadic ALS. Neurology 2006;67:771-776.
13. Adkins S, Gan KN, Mody M, La Du BN. Molecular basis for the polymorphic forms of human serum paraoxonase/arylesterase: glutamine or arginine at position 191, for the respective A or B allozymes. Am J Hum Genet 1993;52:598-608.
14. Davies HG, Richter RJ, Keifer M, Broomfield CA, Sowalla J, Furlong CE. The effect of the human serum paraoxonase polymorphism is reversed with diazoxon, soman and sarin. Nat Genet 1996;14:334-336.
15. Abhinav K, Stanton B, Johnston C, Hardstaff J, Orrell RW, Howard R et al. Amyotrophic lateral sclerosis in South-East England: a population-based study. The South-East England register for amyotrophic lateral sclerosis (SEALS Registry). Neuroepidemiology 2007;29:44-48.
16. Kasperaviciute D, Weale ME, Shianna KV, Banks GT, Simpson CL, Hansen VK et al. Large-scale pathways-based association study in amyotrophic lateral sclerosis. Brain 2007;130:2292-2301.
17. Office of National Statistics. Population density, 2002 Regional trends [Internet]. Islington (United Kingdom): ONS. [cited 2008 February 19] Available from: http://www.statistics.gov.uk/StatBase/ssdataset.asp?vlnk=7662&More=Y
18. Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet 2005;76:887-893.
19. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B et al. The structure of haplotype blocks in the human genome. Science 2002;296:2225-2229.
20. Cherki M, Berrougui H, Isabelle M, Cloutier M, Koumbadinga GA, Khalil A. Effect of PON1 polymorphism on HDL antioxidant potential is blunted with aging. Experiment Gerontol 2007;42:815-824.
21. Brophy VH, Jarvik GP, Richter RJ, Rozek LS, Schellenberg GD, Furlong CE. Analysis of paraoxonase (PON1) L55M status requires both genotype and phenotype. Pharmacogenetics 2000;10:453-460.
22. O’Leary KA, Edwards RJ, Town MM, Boobis AR. Genetic and other sources of variation in the activity of serum paraoxonase/diazoxonase in humans: consequences for risk from exposure to diazinon. Pharmacogenet Genomics 2005;15:51-60.
23. Brophy VH, Jampsa RL, Clendenning JB, McKinstry LA, Jarvik GP, Furlong CE. Effects of 5’ regulatory-region polymorphisms on paraoxonase-gene (PON1) expression. Am J Hum Genet 2001;68:1428-1436.
24. Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls! Am J Epidemiol 1996;144:207-213.
25. Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med 1994;13:153-162.


3

A CASE OF ALS-FTD IN A LARGE FALS PEDIGREE WITH A K17I ANG MUTATION

NEUROLOGY. 2009;72(3):287-8

Michael A van Es, Frank P Diekstra, Jan H Veldink, Frank Baas, Pierre R Bourque, Helenius J Schelhaas, Eric Strengman, Eric AM Hennekam, Dick Lindhout, Roel A Ophoff, Leonard H van den Berg

INTRODUCTION
Approximately 90% of amyotrophic lateral sclerosis (ALS) cases are sporadic (SALS), but 10% are familial (FALS). Mutations in SOD1, Alsin, Dynactin, SETX, DJ-1, VAPB, and TDP-43 have been reported (table e-1).1 After the identification of sequence variation in VEGF in patients with ALS, mutations in another angiogenic gene (ANG) were identified in SALS and FALS.2,3 Studies in other populations have identified ANG mutations in patients with ALS, but also in healthy controls. This suggests that not all mutations are pathogenic.3,4

METHODS
A total of 39 unrelated FALS patients, negative for SOD1 mutations, were screened for ANG mutations. This study was approved by the local ethics committee and participants provided informed consent. DNA was isolated from venous blood and ANG mutation analysis was performed as described in appendix e-1. A total of 275 unrelated, healthy controls were taken from a prospective population-based study on ALS in The Netherlands and were also screened.5 PMut (http://mmb2.pcb.ub.es:8080/PMut/) was used to predict the impact of an amino acid substitution on the structure and function of the protein.

RESULTS
We identified one mutation in one patient (122 A>T) (figure, A), leading to an amino acid substitution of lysine to isoleucine (K17I) (figure, B). PMut analysis predicted this mutation to be pathogenic. Sequence alignments of ANG in different species demonstrated high conservation (figure, C). Analysis of this pedigree revealed an autosomal dominant inheritance of the mutation (male to male transmission) (figure, D). DNA was available from 44 out of 62 family members (five affected individuals). All affected family members carried the K17I mutation. Ten carriers were identified, but all were under 50 years of age (except one who was 75 years old without symptoms or signs of ALS). The K17I mutation was not found in 275 control samples.

Cases III-3, III-4, and IV-1 all presented with progressive upper and lower motor neuron loss of limbs. Case III-1 rapidly developed weakness in both arms with atrophy, fasciculations, and dyspnea, but no upper motor neuron signs. The patient died after 6 months from onset. Case III-2 initially presented with parkinsonism (bradykinesia, diminished postural reflexes, cogwheel rigidity [right arm], shuffling, short-stepped gait, and decreased spontaneous eye blink rate). There was no autonomic dysfunction and eye movements were intact. Dopaminergic treatment had little effect. After 5 years, the patient developed progressive weakness of the arms and legs with atrophy, fasciculations, and hyperreflexia. Interestingly, the patient also demonstrated symptoms characteristic of frontotemporal dementia (FTD), such as loss of interest in social contacts and family, short attention span, logopenia, verbal apraxia, perseveration, decreased personal hygiene, hyperorality, reckless behavior in traffic, sexual disinhibition, and apathy.

Cases I-2 and II-4 also appear to have been affected. However, no medical records were available. Patient I-2 developed limb weakness at age 70, leading to paralysis and death within 3 years. Patient II-4 developed speech impairment at age 60 and also died within 3 years. Patient II-2 (obligate carrier) died at age 50 from cardiovascular disease. Detailed clinical characteristics are provided in table e-2.
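The nucleotide change reported in the Results (A>T, lysine to isoleucine) constrains which codon must be involved. As a small worked check using the standard genetic code, the sketch below enumerates every single A>T substitution in the two lysine codons and keeps those that yield isoleucine. The helper name is hypothetical, and nothing here depends on the actual ANG sequence.

```python
# Standard genetic code (DNA alphabet), restricted to the two amino acids needed.
LYSINE = {"AAA", "AAG"}
ISOLEUCINE = {"ATT", "ATC", "ATA"}

def a_to_t_substitutions(codon):
    """Yield (0-based position, mutated codon) for every single A>T change."""
    for i, base in enumerate(codon):
        if base == "A":
            yield i, codon[:i] + "T" + codon[i + 1:]

# (original codon, 1-based position within the codon, mutated codon)
hits = [
    (codon, pos + 1, mutant)
    for codon in sorted(LYSINE)
    for pos, mutant in a_to_t_substitutions(codon)
    if mutant in ISOLEUCINE
]
print(hits)  # [('AAA', 2, 'ATA')]
```

Only an AAA codon mutated at its second base gives isoleucine, so a Lys-to-Ile change caused by a single A>T substitution implies AAA>ATA.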

DISCUSSION
Several ANG mutations in FALS have been reported, but clear segregation of mutations with the disease has not been shown. Here, we report the K17I mutation segregating with disease in a large pedigree. The fact that II-2 and a carrier (75 years of age) were without symptoms of ALS suggests incomplete penetrance of the mutation. This might explain why mutations in this codon have only been found in SALS. The K17I mutation was previously reported in three cases and K17E in two cases.3,6

Figure Mutational analysis and partial pedigree

This study provides a report of a patient with an ANG mutation and ALS, FTD, and parkinsonism. Five percent of patients with ALS also have FTD and up to 50% demonstrate mild cognitive impairment. Similarly, relatives of patients with ALS have an increased risk for developing PD. Therefore, genes involved in ALS are also considered candidate genes for other neurodegenerative disorders. Indeed, an Italian study reported a SALS patient with a 132C>T mutation and frontal lobe dysfunction.4

ANG is highly conserved between species, suggesting it has an important biologic function. Modeling of the K17I mutation using PMut predicted this to be pathogenic. Two functional studies demonstrated that the K17I mutation results in loss of function, possibly leading to insufficient ribosome synthesis, decreased protein translation, and ultimately decreased motor neuron viability.6,7

We report segregation of the K17I mutation with FALS and a patient with FALS, FTD, and parkinsonism, which possibly implicates ANG in these diseases.


REFERENCES
1. Valdmanis PN, Rouleau GA. Genetics of familial amyotrophic lateral sclerosis. Neurology 2008;70:144-152.
2. Greenway MJ, Alexander MD, Ennis S, et al. A novel candidate region for ALS on chromosome 14q11.2. Neurology 2004;63:1936-1938.
3. Greenway MJ, Andersen PM, Russ C, et al. ANG mutations segregate with familial and ‘sporadic’ amyotrophic lateral sclerosis. Nat Genet 2006;38:411-413.
4. Gellera C, Colombrita C, Ticozzi N, et al. Identification of new ANG gene mutations in a large cohort of Italian patients with amyotrophic lateral sclerosis. Neurogenetics 2008;9:33-40.
5. van Es MA, Van Vught PW, Blauw HM, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877.
6. Wu D, Yu W, Kishikawa H, et al. Angiogenin loss-of-function mutations in amyotrophic lateral sclerosis. Ann Neurol 2007;62:609-617.
7. Crabtree B, Thiyagarajan N, Prior SH, et al. Characterization of human angiogenin variants implicated in amyotrophic lateral sclerosis. Biochemistry 2007;46:11810-11818.

SUPPLEMENTARY INFORMATION

APPENDIX E-1
Mutation analysis: DNA was amplified with PCR using primers: ANG_xn2_For, 5’ TGTTCTTGGGTCTACCACAC; ANG_xn2_Rev, 5’ AATGGAAGGCAAGGACAGC. Forward and reverse strands were sequenced with the same primers. Sequence reaction products were purified using Sephadex (GE Healthcare) columns and run on an ABI 3730 automated sequencer. Traces were analyzed using ContigExpress from the Vector NTI Suite 10 (Invitrogen). All mutations were confirmed using stock DNA samples.

Table e-1 Mutations in Familial ALS

Gene       Ref.   Chromosome   Inheritance   Clinical Features
SOD1       1      21q22        AD            Typical ALS
ALSin      2      2q33         AR            Juvenile onset, slowly progressive, predominantly corticospinal
VAPB       3      20q13        AD            Typical ALS
Dynactin   4      2p13         AD            Adult onset, slowly progressive, early vocal cord paralysis
DJ-1       5      22q13.2      AR            Parkinson's disease, ALS-FTDP
SETX       6      9q34         AD            Adult onset, slowly progressive
ANG        7      14q11        AD            Typical ALS
TDP-43     8      1p26         AD            Typical ALS

Ref., number in the supplementary references.

Table e-2 Clinical findings of the family with ALS and ANG K17I mutation

Subject   Age at        Site of   Respiratory   Upper motor    Lower motor    Disease
no.       Onset (yrs)   Onset     Involvement   neuron signs   neuron signs   duration (mo)   ALS Plus
I:2*      ± 70          Limb      Yes           na             na             ± 36 †          -
II:4*     ± 60          Bulbar    Yes           na             na             ± 36 †          -
III:1     61            Limb      Yes           No             Yes            6 †             -
III:2     70            Limb      Yes           Yes            Yes            42 †            -
III:3     72            Limb      Yes           Yes            Yes            24              FTD & PD
III:4     68            Limb      Yes           Yes            Yes            34 †            -
IV:1      55            Limb      Yes           Yes            Yes            26              -

Subject numbers correspond to the numbers in the pedigree in Fig 1c. Age of onset is calculated from initial onset of weakness. Disease duration is calculated from initial manifestation of weakness until death or last date of contact. †: Individual is deceased. *No medical records were available for I:2 and II:4. Information was collected from family members.


SUPPLEMENTARY REFERENCES
1. Rosen DR, Siddique T, Patterson D, et al. Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature 1993;362(6415):59-62.
2. Hadano S, Hand CK, Osuga H, et al. A gene encoding a putative GTPase regulator is mutated in familial amyotrophic lateral sclerosis 2. Nat Genet 2001;29(2):166-173.
3. Nishimura AL, Mitne-Neto M, Silva HC, et al. A mutation in the vesicle-trafficking protein VAPB causes late-onset spinal muscular atrophy and amyotrophic lateral sclerosis. Am J Hum Genet 2004;75(5):822-831.
4. Munch C, Sedlmeier R, Meyer T, et al. Point mutations of the p150 subunit of dynactin (DCTN1) gene in ALS. Neurology 2004;63(4):724-726.
5. Annesi G, Savettieri G, Pugliese P, et al. DJ-1 mutations and parkinsonism-dementia-amyotrophic lateral sclerosis complex. Ann Neurol 2005;58(5):803-807.
6. Chen YZ, Bennett CL, Huynh HM, et al. DNA/RNA helicase gene mutations in a form of juvenile amyotrophic lateral sclerosis (ALS4). Am J Hum Genet 2004;74(6):1128-1135.
7. Greenway MJ, Alexander MD, Ennis S, et al. A novel candidate region for ALS on chromosome 14q11.2. Neurology 2004;63(10):1936-1938.
8. Sreedharan J, Blair IP, Tripathi VB, et al. TDP-43 mutations in familial and sporadic amyotrophic lateral sclerosis. Science 2008;319(5870):1668-1672.


4

MAPPING OF GENE EXPRESSION REVEALS CYP27A1 AS A SUSCEPTIBILITY GENE FOR SPORADIC ALS

PLOS ONE. 2012;7(4):E35333

Frank P Diekstra,* Christiaan GJ Saris,* Wouter van Rheenen, Lude Franke, Ritsert C Jansen, Michael A van Es, Paul WJ van Vught, Hylke M Blauw, Ewout JN Groen, Steve Horvath, Karol Estrada, Fernando Rivadeneira, Albert Hofman, Andre G Uitterlinden, Wim Robberecht, Peter M Andersen, Judith Melki, Vincent Meininger, Orla Hardiman, John E Landers, Robert H Brown Jr, Aleksey Shatunov, Christopher E Shaw, P Nigel Leigh, Ammar Al-Chalabi, Roel A Ophoff, Leonard H van den Berg*, Jan H Veldink*

* These authors contributed equally to this work

ABSTRACT
Amyotrophic lateral sclerosis (ALS) is a progressive, neurodegenerative disease characterized by loss of upper and lower motor neurons. ALS is considered to be a complex trait and genome-wide association studies (GWAS) have implicated a few susceptibility loci. However, many more causal loci remain to be discovered. Since it has been shown that genetic variants associated with complex traits are more likely to be eQTLs than frequency-matched variants from GWAS platforms, we conducted a two-stage genome-wide screening for eQTLs associated with ALS. In addition, we applied an eQTL analysis to fine-map association loci.

Expression profiles using peripheral blood of 323 sporadic ALS patients and 413 controls were mapped to genome-wide genotyping data. Subsequently, data from a two-stage GWAS (3,568 patients and 10,163 controls) were used to prioritize eQTLs identified in the first stage (162 ALS, 207 controls). These prioritized eQTLs were carried forward to the second sample with both gene-expression and genotyping data (161 ALS, 206 controls). Replicated eQTL SNPs were then tested for association in the second-stage GWAS data to find SNPs associated with disease that survived correction for multiple testing.

We thus identified twelve cis eQTLs with nominally significant associations in the second-stage GWAS data. Eight SNP-transcript pairs of highest significance (lowest p=1.27×10⁻⁵¹) withstood multiple-testing correction in the second stage and modulated CYP27A1 gene expression. Additionally, we show that C9orf72 appears to be the only gene in the 9p21.2 locus that is regulated in cis, showing the potential of this approach in identifying causative genes in association loci in ALS.

This study has identified candidate genes for sporadic ALS, most notably CYP27A1. Mutations in CYP27A1 are causal to cerebrotendinous xanthomatosis, which can present as a clinical mimic of ALS with progressive upper motor neuron loss, making it a plausible susceptibility gene for ALS.

INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease characterized by progressive muscle weakness caused by loss of central and peripheral motor neurons. Symptoms typically have a localized limb or bulbar onset and progress to other muscle groups of the body. Denervation of respiratory muscles and dysphagia leading to respiratory complications are the most common causes of death. There is no cure for this rapidly progressive disease.

Approximately 5% of patients have a family history of ALS.1 All other cases are considered to have a sporadic form of the disease. ALS is considered to be a disease of complex etiology with both genetic and environmental factors contributing to disease susceptibility.2 These genetic factors are the subject of extensive research.3 Multiple genome-wide association studies (GWAS) and candidate gene studies have been carried out, implicating several genes in the susceptibility to ALS,4-8 but attempts to replicate most of these genes have proven difficult.9-13 Recently, our group has published a GWAS comprising over 4,800 patients and nearly 15,000 controls and identifying UNC13A and 9p21.2 as susceptibility loci for sporadic ALS.7 The 9p21.2 locus was recently replicated in an independent set of British patients and controls12 and also shown to be strongly associated with ALS in Finland.14 This locus was previously found to be one of the linked loci in families with ALS and frontotemporal dementia (FTD), and it was recently shown that a hexanucleotide repeat expansion in C9orf72 was the basis of this linkage signal.15,16

Despite these large study samples, GWAS have been able to explain only little of the genetic variation in ALS.4-7 An important drawback of GWAS is the burden of multiple-testing correction, requiring even larger sample sizes in order to be able to detect small effects. It is common practice to apply a strict Bonferroni correction to GWAS data. With so many tests, there is a high false-negative rate, as true associations are hidden in the fog of random associations.

It has been established that gene expression levels can be mapped to genomic variation as a quantitative trait in order to detect so-called expression quantitative trait loci (eQTLs).17-19 Recently, it has been shown that trait-associated SNPs are more likely to be eQTLs20, making the systematic analysis of eQTLs in the context of a GWAS a promising tool for the discovery of novel disease-causing genes. In addition, eQTLs can have local and distant effects, allowing for the identification of parts of biological networks related to disease. These networks might be the link between several different genetic variants that appear to be associated with a disease in a GWAS.19 In practical terms, in order to identify eQTLs associated with disease, both genome-wide genotype data as well as genome-wide gene expression levels have to be collected. The focused genetic mapping of gene expression levels has frequently been applied to the fine-mapping of risk loci resulting from GWAS, for example in the study of asthma21 and Crohn’s disease.22 Furthermore, genome-wide eQTL analysis has proven fruitful in the study of diseases including obesity23, hypercholesterolemia24, celiac disease25, and late-onset Alzheimer disease.26 In the present study, we have performed a genome-wide screen for eQTLs associated with susceptibility to ALS.
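The idea of treating expression as a quantitative trait can be illustrated with a single SNP-transcript pair. In the sketch below, expression is regressed on genotype dosage (0, 1 or 2 copies of one allele) by ordinary least squares; a non-zero slope is what marks the SNP as a candidate eQTL. The data are hypothetical, the function name is ours, and the statistical model actually used in this study is not assumed to be the same.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope and intercept of expression on allele dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical log2 expression values for six individuals, two per genotype.
dosages = [0, 0, 1, 1, 2, 2]
expression = [7.9, 8.1, 8.4, 8.6, 8.9, 9.1]

slope, intercept = eqtl_slope(dosages, expression)
print(f"expression changes by {slope:.2f} log2 units per allele copy")
```

In a genome-wide screen this regression is repeated for every SNP-probe pair, which is exactly what creates the multiple-testing burden discussed above.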

Figure 1 Study design

A schematic overview of our study design is shown in Figure 1. We performed an initial screen for eQTLs in an eQTL discovery set. The eQTL SNPs resulting from this screen that had a nominally significant effect in a discovery set from our previously published GWAS7 were selected for follow-up in the eQTL replication set. Ultimately, replicated eQTLs were tested for significant effects in the GWAS replication data, correcting for multiple testing.

METHODS

ETHICS STATEMENT
All participants gave written informed consent and approval was obtained from the Institutional Review Board of the University Medical Center Utrecht. The present study was conducted according to the principles expressed in the Declaration of Helsinki.

GWAS DATA
Genome-wide genotype data were derived from a previously published GWAS of sporadic ALS in seven countries (The Netherlands, Belgium, France, Ireland, United Kingdom, Sweden, United States).7 All patients fulfilled the 1994 El Escorial criteria for probable or definite ALS.27 Cohorts for which genome-wide SNP data were available were included. For both the discovery and replication set, genotype files with Illumina Beadchip data (HumanHap 300K, HumanCNV 370K, HumanHap 550K or HumanHap 610K platforms) were merged and the following quality control measures were taken. Only SNPs common to all cohorts were used. Triallelic and C/G or A/T SNPs were excluded. Genotype files were merged, and after each merge, a flipscan (scan for possible allele swaps) was performed in PLINK v1.07.28 SNPs with call rate <95%, minor allele frequency <5%, deviation from Hardy-Weinberg equilibrium in controls (p<1×10⁻⁴), or with differing heterozygosity or missing rates between cases and controls were excluded. Duplicate samples, samples with a genotyping rate <95%, samples without gender information, or samples where the genotypic gender did not match the phenotype file gender were excluded. LD-based SNP pruning was used to determine a subset of SNPs in approximate linkage equilibrium. This subset of SNPs was used to identify related samples, which were subsequently removed (pi-hat >0.2). The software package EIGENSTRAT was used to detect population substructure by principal components analysis.29 HapMap phase III release 2 genotypes were added into this analysis in order to determine population outliers. After removal of population outliers, new principal components were calculated. More detailed data on included subjects, genotyping methods, and quality control are available in Text S1 and Table S5.
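The SNP-level exclusion criteria listed above can be summarized as a single filter. The sketch below is an illustration only: the record layout, field names and example SNPs are hypothetical, the actual filtering was done in PLINK, and the comparison of heterozygosity and missingness between cases and controls is left out.

```python
# Allele pairs that read the same on both strands and so cannot be
# reliably aligned when merging genotype files.
STRAND_AMBIGUOUS = {frozenset(("A", "T")), frozenset(("C", "G"))}

def passes_snp_qc(snp):
    """Apply the SNP thresholds described in the text to one SNP record."""
    alleles = snp["alleles"]
    if len(alleles) != 2:                      # triallelic SNPs
        return False
    if frozenset(alleles) in STRAND_AMBIGUOUS:  # C/G or A/T SNPs
        return False
    if snp["call_rate"] < 0.95:
        return False
    if snp["maf"] < 0.05:
        return False
    if snp["hwe_p_controls"] < 1e-4:
        return False
    return True

# Hypothetical SNP summaries, one per failure mode plus one that passes.
snps = [
    {"id": "snp_ok", "alleles": ("A", "G"), "call_rate": 0.99, "maf": 0.21, "hwe_p_controls": 0.40},
    {"id": "snp_at", "alleles": ("A", "T"), "call_rate": 0.99, "maf": 0.30, "hwe_p_controls": 0.50},
    {"id": "snp_tri", "alleles": ("A", "C", "G"), "call_rate": 0.99, "maf": 0.10, "hwe_p_controls": 0.90},
    {"id": "snp_lowcall", "alleles": ("C", "T"), "call_rate": 0.93, "maf": 0.25, "hwe_p_controls": 0.70},
    {"id": "snp_rare", "alleles": ("C", "T"), "call_rate": 0.99, "maf": 0.02, "hwe_p_controls": 0.70},
    {"id": "snp_hwe", "alleles": ("G", "T"), "call_rate": 0.99, "maf": 0.25, "hwe_p_controls": 1e-6},
]

kept = [snp["id"] for snp in snps if passes_snp_qc(snp)]
print(kept)  # ['snp_ok']
```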

EXPRESSION DATA
Genome-wide gene expression data were obtained from 805 Dutch individuals (357 patients and 448 controls), who were also genotyped on either the HumanHap 300K, HumanCNV 370K or HumanHap 550K platforms in the previously described GWAS.7 Patients were recruited at our referral clinic for motor neuron disease at the University Medical Center Utrecht, The Netherlands. Included patients were diagnosed with probable or definite sporadic ALS according to the 1994 El Escorial criteria.27 Messenger RNA was collected and extracted from peripheral whole blood using PAXgene tubes and the PAXgene extraction kit (Qiagen). Samples were hybridized to Illumina HumanHT-12v3 Expression BeadChips. Case and control samples were randomly assigned to the chips and all chips were run in one batch. Before quality control, expression levels were available for 48,803 probes. Raw expression data were quantile normalized30 and log2 transformed in R (2009, The R Foundation for Statistical Computing). Using principal components analysis of expression data, outlier arrays were detected. Non-pseudoautosomal Y chromosome transcript expression levels were used for a gender check. Outlier arrays, samples with inconsistent gender information, and samples designated as duplicates in our GWAS data were removed from the raw data (n=67). Also, non-autosomal probes were excluded (n=2,002). The thus obtained trimmed raw dataset was again quantile normalized and log2 transformed. All probe sequences were aligned to the NCBI build 36 reference genome using UCSC’s Genome Browser function BLAT.31 Non-specific probes, defined as no or multiple hits with a >95%, were removed (n=7,234). RefSeq (updated on 27 September 2010) and UniGene (build #228, release date 29 October 2010) databases were used to determine probes mapping to transcripts designated as retired and these probes were excluded as well (n=2,449), leaving 37,118 gene-expression probes.
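The normalization step described above (quantile normalization followed by a log2 transform) can be sketched as follows. This is a minimal pure-Python illustration on a toy probes-by-arrays matrix, not the R code used in the study. The function names are hypothetical, and ties are broken by probe order, which is a simplification.

```python
import math

def quantile_normalize(matrix):
    """Quantile-normalize a probes x arrays matrix (list of rows).

    Each array (column) is sorted, the mean across arrays is taken at every
    rank, and each value is replaced by the mean belonging to its rank.
    """
    n_probes = len(matrix)
    n_arrays = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_arrays)]
    # Probe indices of each array, ordered by increasing intensity
    orders = [sorted(range(n_probes), key=col.__getitem__) for col in columns]
    # Reference distribution: mean intensity at each rank across arrays
    rank_means = [
        sum(columns[j][orders[j][r]] for j in range(n_arrays)) / n_arrays
        for r in range(n_probes)
    ]
    normalized = [[0.0] * n_arrays for _ in range(n_probes)]
    for j in range(n_arrays):
        for rank, probe in enumerate(orders[j]):
            normalized[probe][j] = rank_means[rank]
    return normalized

def log2_transform(matrix):
    """Element-wise log2 of a matrix of positive intensities."""
    return [[math.log2(v) for v in row] for row in matrix]

raw = [[2.0, 4.0], [8.0, 16.0], [4.0, 2.0]]   # toy data: 3 probes x 2 arrays
normalized = quantile_normalize(raw)
expr = log2_transform(normalized)
```

After normalization every array shares the same set of intensity values, so differences between arrays reflect rank changes of probes rather than overall intensity shifts.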

EQTL DATASETS
For the genetic mapping of gene expression, the subset of Dutch individuals with both genome-wide genotype and expression data was tested for population substructure by principal components analysis of genomic data using EIGENSTRAT.29 By inspecting the first two principal components, two outlier samples (one case, one control) were identified and excluded. Subsequently, new principal components were calculated. Non-autosomal SNPs were removed from the eQTL analysis. We randomly split our expression dataset to form equally sized discovery and replication sets (Table S1).

STATISTICAL ANALYSIS
For the GWAS data, association with disease was tested in a logistic model using gender, dummy-coded nationality and the first eight principal components (to correct for ancestry) as covariates. To determine the number of principal components to be included in the logistic regression model, the first ten principal components from the EIGENSTRAT29 analysis were tested for association with case/control status (threshold p<0.05). For the GWAS discovery set, eight principal components were included in the logistic model, while for the GWAS replication set two principal components were included. Analyses were performed in PLINK v1.0728 and R (2009, The R Foundation for Statistical Computing).

For all analyses involving expression data, Surrogate Variable Analysis (SVA) was used to account for heterogeneity in gene expression due to known and unknown environmental, technical or demographic factors.32 SVA captures these factors into covariates for use in statistical models. Additionally, we obtained ‘riluzole use’ status, riluzole being the only drug available to ALS patients with a proven effect on survival.

For the eQTL analyses, SNP genotypes coded as an additive genetic model were tested for association with gene expression by linear regression using disease status, age, gender, surrogate variables (18 in the discovery set and 19 in the replication set) and riluzole use as covariates. Cis eQTLs were defined as SNPs modulating transcript expression levels within a region of 1Mb surrounding a probe’s genomic midpoint.26 False-positive cis effects may, however, occur due to SNPs that are located within a transcript probe or that are in linkage disequilibrium (LD) with SNPs mapping within a transcript probe.33 We used the Broad Institute SNAP tool v2.234 to determine pairwise LD between cis effect SNPs and SNPs mapping to a transcript probe in either of the HapMap phase III release 2 or 1000 Genomes Pilot 1 CEU panels. 21,863 SNP-transcript combinations (pairwise LD threshold r² >0.2) were excluded from analysis. Similarly, we removed 24,170 SNP-transcript combinations with an InDel overlapping with a transcript probe, according to the Database of Genomic Variants (version 10, November 2010).35 There were 3,541,781 possible SNP-transcript combinations in cis left for analysis. The number of possible combinations in cis was used for Benjamini-Hochberg false discovery rate (FDR) calculations. Significant cis effects were those SNP-transcript pairs that had significant p values at an FDR of 5% after 10,000 permutations. Permutations were performed by swapping case/control labels so that each subject is assigned the genotype vector of another random subject, while the expression matrix is unchanged. This prevents the underestimation of the null distribution, thereby preventing the detection of false-positive eQTLs, as described previously.36 Analyses were performed in PLINK28 and R (2009, The R Foundation for Statistical Computing).
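The permutation scheme described above, in which whole genotype vectors are reassigned between subjects while the expression data stay in place, can be sketched as follows. This is an illustrative simplification, not the study's implementation. It uses the absolute Pearson correlation as the test statistic instead of a covariate-adjusted linear regression, it handles a single transcript, and the function names are hypothetical.

```python
import random

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def permutation_pvalues(genotypes, expression, n_perm=1000, seed=0):
    """Empirical p values for every SNP against one transcript.

    genotypes: one row per subject, holding that subject's genotype vector.
    expression: one expression value per subject, in the same subject order.
    """
    rng = random.Random(seed)
    n_snps = len(genotypes[0])

    def statistics(rows):
        return [abs(pearson_r([row[s] for row in rows], expression))
                for s in range(n_snps)]

    observed = statistics(genotypes)
    hits = [0] * n_snps
    subjects = list(range(len(genotypes)))
    for _ in range(n_perm):
        rng.shuffle(subjects)
        # Whole genotype rows move together, so LD between SNPs is preserved;
        # the expression values stay with their original subjects.
        permuted_rows = [genotypes[i] for i in subjects]
        for s, value in enumerate(statistics(permuted_rows)):
            if value >= observed[s]:
                hits[s] += 1
    # Add-one correction keeps the empirical p value strictly positive
    return [(h + 1) / (n_perm + 1) for h in hits]
```

Because each subject's genotype row is moved as a unit, correlated SNPs remain correlated in every permuted dataset, which is the property the text relies on to obtain a realistic null distribution.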

EQTL SELECTION
In order to link the identified eQTLs to disease, we made a selection of significant cis effects in the eQTL discovery set. Recent studies on the genetics of gene expression have shown that disease-associated loci from GWAS are greatly enriched for eQTLs.20,25 Thus, we selected SNP-transcript pairs that had a nominal SNP p value <0.05 in our GWAS discovery data (Figure 1). Only these SNP-transcript pairs were used for follow-up in the replication data. Patient characteristics for the expression replication dataset are presented in Table S1. SNP genotypes were correlated to gene expression levels following a similar statistical analysis as used for our discovery set. Again, a 5% FDR significance threshold was applied. Subsequently, association with ALS for SNPs from the replicated cis SNP-transcript pairs was tested in the GWAS replication data by logistic regression using gender, dummy-coded nationality and the first two EIGENSTRAT principal components (these were significantly correlated to case/control status) as covariates. Association test results were clumped based on LD (r² >0.5) using PLINK, so that SNP p values could be obtained for independent eQTLs. eQTLs with a replication pGWAS <0.05 after Bonferroni correction for the number of independent (LD-based clumped) loci were considered to be significant (Figure 1).
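The last two steps above, LD-based clumping followed by Bonferroni correction for the number of independent clumps, can be sketched in Python. This is a simplified greedy clumping under stated assumptions. The pairwise r² lookup is a hypothetical dictionary, SNP identifiers are toy values, and PLINK's own clumping procedure takes additional parameters (for example p value thresholds and a physical distance window) that are not modeled here.

```python
def clump(pvalues, r2, r2_threshold=0.5):
    """Greedy LD-based clumping.

    pvalues: dict mapping SNP id -> association p value.
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
        (pairs that are absent are treated as r^2 = 0).
    Returns a list of (index_snp, member_snps), most significant first.
    """
    remaining = sorted(pvalues, key=pvalues.get)
    clumps = []
    while remaining:
        index_snp = remaining.pop(0)   # most significant unassigned SNP
        members = [s for s in remaining
                   if r2.get(frozenset((index_snp, s)), 0.0) > r2_threshold]
        remaining = [s for s in remaining if s not in members]
        clumps.append((index_snp, members))
    return clumps

def bonferroni_significant(pvalues, clumps, alpha=0.05):
    """Index SNPs whose p value survives correction for the number of clumps."""
    threshold = alpha / len(clumps)
    return [index for index, _ in clumps if pvalues[index] < threshold]

# Toy example with hypothetical SNPs and LD values
pvalues = {"snpA": 0.001, "snpB": 0.004, "snpC": 0.03, "snpD": 0.2}
r2 = {frozenset(("snpA", "snpB")): 0.9, frozenset(("snpC", "snpD")): 0.1}
clumps = clump(pvalues, r2)
significant = bonferroni_significant(pvalues, clumps)
```

In the toy data, snpA and snpB collapse into one clump, so the correction is applied for three independent loci rather than four SNPs.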


RESULTS

EQTL DISCOVERY
After quality control, eQTL analyses were performed on 162 ALS cases and 207 controls in the eQTL discovery set with data on 261,682 autosomal SNPs and 37,118 expression probes. Patient characteristics are summarized in Table S1. At a Benjamini-Hochberg false discovery rate (FDR) of 5%, we detected 16,901 significant SNP-transcript pairs in cis (Figure 1).

GWAS DISCOVERY
In the GWAS discovery set, 2,261 ALS cases and 8,328 controls remained after quality control measures, with genotypes for 268,952 SNPs. Details of included study populations are shown in Table S2. Association analysis resulted in one SNP (rs12608932 in gene UNC13A) with genome-wide significance (p=1.7×10⁻⁸) after Bonferroni correction for 268,952 SNPs. A Manhattan plot of genome-wide results is shown in Figure S1. A quantile-quantile plot of disease association p values is provided in Figure S2 (genomic control λ=1.03). There were 14,167 autosomal SNPs with a nominal p value <0.05. These SNPs were used to prioritize eQTLs found in the eQTL discovery set (Figure 1).

From the eQTL discovery results, we selected the 1,108 SNP-transcript pairs (755 eQTL SNPs) in cis with discovery pGWAS <0.05 (Figure 1). To confirm the hypothesis that disease-associated SNPs are more likely to be cis eQTLs,20 we searched for enrichment for eQTLs in our list of SNPs with pGWAS <0.05. We first determined the number of cis eQTLs in the set of SNPs with pGWAS <0.05 (n=755). Then, we randomly selected a subset of 14,167 SNPs with pGWAS >0.05, matched for minor allele frequency to the set of SNPs with pGWAS <0.05 (in 5% frequency bins). Subsequently, we determined the number of eQTLs present in each of these sets of SNPs, using 100,000 permutations. By determining how often more than the initial number of eQTLs was observed, we showed that there was evidence for enrichment for eQTLs in the set of disease-associated SNPs (empirical p=0.003).
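The enrichment test described above can be sketched as a frequency-matched resampling procedure. The following Python sketch is illustrative only. The function and argument names are hypothetical, each frequency bin of the background pool is assumed to contain at least as many SNPs as are needed from it, and the comparison uses "at least as many" with an add-one correction, which is a slightly conservative variant of the counting rule stated in the text.

```python
import random
from collections import Counter, defaultdict

def maf_bin(maf):
    """Index of the 5%-wide minor allele frequency bin."""
    return int(maf * 100) // 5

def enrichment_pvalue(associated, background, maf, is_eqtl, n_perm=1000, seed=0):
    """Empirical p value for eQTL enrichment among disease-associated SNPs.

    associated: SNP ids with a nominal GWAS association.
    background: SNP ids without nominal association, to sample from.
    maf: dict mapping SNP id -> minor allele frequency.
    is_eqtl: set of SNP ids that are cis eQTLs.
    """
    rng = random.Random(seed)
    observed = sum(s in is_eqtl for s in associated)
    # Number of SNPs to draw from each frequency bin
    needed = Counter(maf_bin(maf[s]) for s in associated)
    pool = defaultdict(list)
    for s in background:
        pool[maf_bin(maf[s])].append(s)
    exceed = 0
    for _ in range(n_perm):
        count = 0
        for b, k in needed.items():
            sample = rng.sample(pool[b], k)   # frequency-matched random draw
            count += sum(s in is_eqtl for s in sample)
        if count >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Matching on allele frequency matters because the power to detect an eQTL depends on the frequency of the SNP, so an unmatched random set could differ in eQTL content for reasons unrelated to disease association.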

EQTL REPLICATION
The eQTL replication set comprised 161 ALS patients and 206 control samples (Table S1). Of the 1,108 selected SNP-transcript pairs in cis, 951 were significantly replicated (Figure 1). The eQTL SNPs of these SNP-transcript pairs were selected for replication in the GWAS replication data.

GWAS REPLICATION
After quality control, there were 1,307 ALS cases and 1,835 controls in the GWAS replication set with genotypes for 266,492 SNPs (Table S2). 577 cis eQTL SNPs were tested for association in the GWAS replication data. Using linkage disequilibrium-based clumping of association results,28 322 independent clumps could be formed. This number of clumps was used for Bonferroni correction, as these clumps designate independent loci. Table 1 shows clumps with a nominal pGWAS <0.05 in the replication set. Ultimately, we identified 1 cis eQTL, comprising 8 SNP-transcript pairs, which was significantly replicated, and the transcript of which mapped to gene CYP27A1 (Figure 2). The results for this locus are listed in Table S3, also indicating that the explained variance of gene expression that is achieved by the linear models ranged from 48-65%. The relationships between the SNPs and gene-expression levels are shown in Figure S3.

Table 1 eQTLs with a nominally significant GWAS p value in the replication data
[Table body could not be recovered.]

Table 2 Results for fine-mapping of loci previously associated with ALS
Columns: Locus; Illumina HT-12v3 probe identifier; SNP; Minor allele; LD with rs3849942 (r², D'); GWAS discovery SNP association (OR, p); GWAS replication SNP association (OR, p); Joint GWAS SNP association (OR, p); eQTL p value after 10,000 permutations (Discovery, Replication); Expression variance explained (R²), combined data.
Rows: Chr. 9, C9orf72 isoform a, probe ILMN_1741881, SNPs rs1565948 and rs10122902.
[Table values could not be recovered.]
The minor allele of rs10122902 was associated with increased C9orf72 expression levels, while the minor allele of rs1565948 was associated with decreased expression. SNP association results in the joint GWAS data were based on a total of 3,568 ALS patients and 10,163 controls. The expression explained variance (R²) was estimated from expression data from both discovery and replication eQTL datasets combined. C9orf72, chromosome 9 open reading frame 72; Chr., chromosome; LD, linkage disequilibrium; GWAS, genome-wide association study; OR, odds ratio; eQTL, expression quantitative trait locus.

Figure 2 Regional linkage disequilibrium (LD) near the CYP27A1 locus on chromosome 2
Top: the position of GWAS SNPs and RefSeq genes located within the regional LD block are drawn. On the X-axis, genomic position in kb, aligned to NCBI genome build 36 coordinates. On the left Y-axis, -log10(p values) for the strongest cis eQTL association for a gene in the replication data; the vertical positions of genes (drawn as arrows) are aligned to this axis and thus represent statistical significance. For one gene (RQCD1), no SNP-transcript pair and, therefore, no eQTL p value was available in our data. This gene is shown as a dashed arrow. On the right Y-axis, -log10(p values) from the replication GWAS analysis for SNPs within the region (black line); SNPs modulating CYP27A1 expression are shown as black dots, other SNPs are grey. Bottom: pairwise linkage disequilibrium for HapMap phase III release 2 SNPs (CEU+TSI populations). The LD plot was created in Haploview v4.250, using the standard D’/LOD color scheme.
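For reference, the per-locus significance threshold implied by Bonferroni correction for the 322 independent clumps reported in the GWAS replication results can be worked out as follows. This is a derived value under the stated α of 0.05, not a figure quoted in the text.

```latex
\alpha_{\text{Bonferroni}} = \frac{0.05}{322} \approx 1.55 \times 10^{-4}
```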

FINE-MAPPING OF LOCI UNC13A AND CHROMOSOME 9p21.2
In addition to our genome-wide screen for eQTLs associated with sporadic ALS, we specifically examined possible relevant cis effects in two previously associated loci (gene UNC13A and chromosome 9p21.2).7,12 The detection of cis effects might fine-map these loci. For the UNC13A locus (SNP rs12608932), multiple-testing correction was applied for 41 possible SNP-transcript pairs in cis (as determined by a genomic distance of <500kb between the SNP and a probe’s midpoint). One SNP-transcript pair had a nominal p value <0.05, the transcript of which mapped to gene PGLS (peQTL=0.01). However, when using a 5% Benjamini-Hochberg FDR for the locus as multiple-testing correction, no SNP-transcript pairs reached statistical significance. For the chromosome 9p21.2 locus, we looked for cis eQTLs within a 130kb LD block comprising previously associated SNPs (rs2814707 and rs3849942). Multiple-testing correction for the testing of 328 SNP-transcript pairs was applied using a 5% FDR. Two SNP-transcript pairs reached the threshold for statistical significance and were associated with C9orf72 isoform a expression levels (Table 2 and Figure S4). SNP rs1565948 modulated C9orf72 gene expression in both eQTL discovery and replication sets and was associated with susceptibility to ALS in the joint GWAS data; however, no association with ALS was found in the GWAS replication set alone (Table 2).
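The Benjamini-Hochberg correction applied per locus above can be illustrated with a minimal sketch of the standard step-up procedure. The function name is hypothetical, and the sketch shows the plain procedure only, without the permutation step used elsewhere in the study. In the usage example, only the count of 41 tests and the smallest p value of 0.01 come from the text; the remaining 40 p values are made-up placeholders.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Indices of tests declared significant by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank k with p_(k) <= (k / m) * fdr defines the cutoff
        if pvalues[i] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# 41 SNP-transcript pairs; the smallest p value is 0.01 as in the text,
# the other 40 values are hypothetical placeholders.
unc13a_like = [0.01] + [0.1 + 0.02 * k for k in range(40)]
significant = benjamini_hochberg(unc13a_like)
```

With 41 tests, the top-ranked p value would have to fall below roughly 0.05/41 ≈ 0.0012 to pass on its own, which is why a nominal p value of 0.01 does not reach significance in this illustration.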

DISCUSSION
The present study reports the results of a large and comprehensive genome-wide screening of the genetics of gene expression in an attempt to find novel genetic variants that associate with sporadic ALS. We used a two-stage approach to minimize the chance of false-positive findings, both for eQTL discovery purposes and for the detection of novel SNP-ALS associations. eQTLs were used for prioritizing GWAS results, as it has been established that SNPs that are truly associated with disease are more likely to be eQTLs.20,25,37 In the present study, we show that the number of eQTLs is greater than expected by chance (p=0.003) among the SNPs with a nominal association with ALS, compared to frequency-matched SNPs, also indicating that eQTLs may be useful in the prioritization of GWAS results in ALS.

We identified eight SNPs in one cis eQTL, modulating CYP27A1 gene expression levels, which replicated in the second eQTL dataset and second GWAS set. The eQTL SNPs within this locus are part of a large linkage disequilibrium (LD) block comprising a total of ten genes (Figure 2). The figure shows that the strongest eQTL associations exist for SNPs modulating CYP27A1 expression, explaining up to 65% of variation in expression of this gene. Additionally, we show that C9orf72 appears to be the only gene in the 9p21.2 locus that is regulated in cis, showing the potential of this approach in identifying causative genes in association loci in ALS.

As shown in Table S3, the SNPs modulating transcript levels had small effect sizes in our joint GWAS association results, the highest odds ratio (OR) being 1.13. We used PS v3.038 for statistical power calculations to determine the required sample size for a third genotypic replication of such SNPs. In order to replicate an association for one SNP with minor allele frequency 0.35 at α=0.05, one would require a minimum of 2,250 cases and 2,250 controls to achieve 80% power for detecting an effect with OR 1.13.
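The order of magnitude of this sample size can be checked with a rough normal-approximation calculation for an allelic case-control test. The sketch below is not the method implemented in PS v3.0, which may differ. It assumes that the minor allele frequency of 0.35 applies to controls, that cases and controls are equal in number, and that each individual contributes two independent alleles. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def cases_needed(maf_controls, odds_ratio, alpha=0.05, power=0.80):
    """Approximate number of cases (with an equal number of controls)
    for a two-sided allelic test, using a two-proportion normal approximation."""
    p0 = maf_controls
    odds1 = odds_ratio * p0 / (1 - p0)
    p1 = odds1 / (1 + odds1)                 # implied allele frequency in cases
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return math.ceil(alleles_per_group / 2)  # two alleles per individual

n_cases = cases_needed(0.35, 1.13)
```

Under these assumptions the result is of the same order as the 2,250 cases and 2,250 controls reported above. Exact agreement should not be expected, since the test and approximations may differ from those in PS.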
As shown in Table 1, several eQTL SNPs did not reach Bonferroni-corrected significance in the replication data alone, but do show stronger effects in the joint GWAS data, indicating that statistical power of the GWAS replication set might be a limiting factor. By testing these SNPs in a third independent replication cohort, additional true associations may be detected. The required sample size for such an effort would, however, increase dramatically when adding more tests. Further international collaboration, therefore, is needed in order to achieve sufficient statistical power for the replication of SNPs with small effect sizes.

We searched the MEDLINE and OMIM databases to identify links between CYP27A1 and known pathways in ALS pathogenesis. The CYP27A1 gene is involved in cholesterol metabolism and has been associated with cerebrotendinous xanthomatosis (CTX), which can present with progressive upper motor neuron signs and is a known clinical mimic for primary lateral sclerosis.39,40 Two heterozygous mutations in CYP27A1 have been reported in a patient with atypical CTX and frontotemporal dementia characteristics.41 Furthermore, serum cholesterol levels have previously been implicated in modifying survival and in the onset of respiratory impairment in ALS patients.42-44 The combination of our results and these prior data makes CYP27A1 a plausible candidate gene for ALS.

A strength of our study is the meticulous pruning of expression probes on the expression array, removing probes that map non-specifically in the human transcriptome or that harbor SNPs that might interfere with hybridization to the array and thereby produce false-positive eQTLs.33 In addition, permutation schemes were applied that preserve the LD structure within subjects, also minimizing the detection of false-positive eQTLs. Finally, a two-stage approach, both for eQTL discovery purposes and for the detection of novel SNP-ALS associations, ensures robustness of the results.

A drawback of the present study lies in the use of whole blood instead of neuronal tissue for the measurement of mRNA expression levels. As neuronal tissue is inaccessible in living ALS patients, one could consider the use of human neuronal tissue from autopsy. However, in post-mortem material of ALS patients, most affected motor neurons will have degenerated and one would be investigating exclusively end-stage disease expression profiles.
We have investigated the proportion of overlapping eQTLs between our study and other studies, including two studies on human brain tissue (Table S4).24,26,45,46 Studies of the genetics of gene expression appear to have modest overlap in the eQTLs identified. For example, 36.1% of genes mapped by a cis eQTL in lymphocytes were also identified in a study using lymphoblastoid cell lines.24,45 A smaller overlap (22%) was found between two studies on brain tissue, which may partly be due to low statistical power.26,46 In the present study, 37-52% of the genes mapped by cis eQTLs in human brain tissue studies appeared to be present in our data (Table S4). The proportion of overlap with studies on blood-derived tissues was comparable (41-45%). Considering the relatively high concordance of genes mapped by cis eQTLs in our screen with those found in human brain tissue, we consider blood to be a valid starting point for genetic mapping of gene expression in ALS. A large collection of central nervous system tissue control samples may, however, further boost the discovery of novel genetic variants that are associated with ALS.

The focused analysis of variants in the chromosome 9p21.2 locus, which was previously associated with ALS,7,12 did not identify rs2814707 or rs3849942 as eQTL SNPs. We did, however, find evidence of two other SNPs (rs10122902 and rs1565948), located within a large LD block surrounding the previously associated markers, to be correlated with altered expression levels of C9orf72 isoform a. SNP rs1565948 was associated with ALS in our joint GWAS data. The rs10122902 variant was not associated with ALS in our joint GWAS, but was previously shown to be part of a haplotype with rs3849942, in which the major allele of rs10122902 was associated with increased risk of ALS.12 Genetic variation in the chromosome 9p21.2 locus, therefore, appears to be associated with altered gene expression of C9orf72. The recent discovery of the intronic hexanucleotide repeat expansion in C9orf72 on a common haplotype in 9p21.2-linked families with ALS and FTD15,16,47 thus illustrates the potential of the combined use of gene expression and genotyping in the search for causative genes in human diseases. The mechanism of the recently discovered repeat expansion in C9orf72, however, remains to be established. There could be a direct effect of expression levels of isoforms of C9orf72, or a “trans”-like effect through RNA toxicity, as shown in other repeat expansion diseases including fragile X-associated tremor/ataxia syndrome (FXTAS).48 Other types of experiments are needed to elucidate this mechanism.

In summary, our genome-wide study of the genetics of gene expression has identified one cis eQTL for sporadic ALS, which modulates CYP27A1 expression, and additionally points to C9orf72 in the chromosome 9p21.2 locus as the gene involved in ALS pathogenesis. To further identify eQTLs relevant to ALS, the concomitant analysis of epigenetic and other -omic data (e.g. proteomic or metabonomic) can be used, as recently shown in a model organism.49 These studies are preferably performed in ‘ALS target tissues’, including post-mortem central nervous system tissues and induced pluripotent stem cells differentiated to a neuronal or glial lineage. Such studies may provide us with more insight into novel pathogenic pathways and networks causal to this devastating disease.

50 MAPPING OF GENE EXPRESSION REVEALS CYP27A1 AS A SUSCEPTIBILITY GENE FOR SPORADIC ALS

REFERENCES 1. Byrne S, Walsh C, Lynch C, Bede P, Elamin M, et al. Rate of familial amyotrophic lateral sclerosis: a systematic review and meta-analysis. J Neurol Neurosurg Psychiatry 2011;82:623-627. 2. Dion PA, Daoud H, Rouleau GA. Genetics of motor neuron disorders: new insights into pathogenic mechanisms. Nat Rev Genet 2010;10:769-782. 3. Schymick JC, Talbot K, Traynor BJ. Genetics of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2007;16 (Spec No. 2):R233-R242. 4. Dunckley T, Huentelman MJ, Craig DW, Pearson JV, Szelinger S, et al. Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med 2007;357:775-788. 5. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877. 6. van Es MA, van Vught PWJ, Blauw HM, Franke L, Saris CGJ, et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet 2008;40:29-31. 7. van Es MA, Veldink JH, Saris CGJ, Blauw HM, van Vught PWJ, et al. Genome-wide association study 4 identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087. 8. Simpson CL, Lemmens R, Miskiewicz K, Broom WJ, Hansen VK, et al. Variants of the elongator protein 3 (ELP3) gene are associated with motor neuron degeneration. Hum Mol Genet 2009;18:472-481. 9. Chiò A, Schymick JC, Restagno G, Scholz SW, Lombardo F, et al. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2009;18:1524-1532. 10. Cronin S, Tomik B, Bradley DG, Slowik A, Hardiman O. Screening for replication of genome-wide SNP associations in sporadic ALS. Eur J Hum Genet 2009;17:213-218. 11. Fernández-Santiago R, Sharma M, Berg D, Illig T, Anneser J, et al. No evidence of association of FLJ10986 and ITPR2 with ALS in a large German cohort. 
Neurobiol Aging 2011;32:551.e1-e4. 12. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994. 13. van Es MA, van Vught PWJ, Veldink JH, Andersen PM, Birve A, et al. Analysis of FGGY as a risk factor for sporadic amyotrophic lateral sclerosis. Amyotroph Lateral Scler 2009;10:441-447. 14. Laaksovirta H, Peuralinna T, Schymick JC, Scholz SW, Lai S-L, et al. Chromosome 9p21 in amyotrophic lateral sclerosis in Finland: a genome-wide association study. Lancet Neurol 2010;9:978-985. 15. Dejesus-Hernandez M, Mackenzie IR, Boeve BF, Boxer AL, Baker M, et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 2011;72:245-256. 16. Renton AE, Majounie E, Waite A, Simón-Sánchez J, Rollinson S, et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011;72:257-268. 17. Cookson W, Liang L, Abecasis G, Moffatt M, Lathrop M. Mapping complex disease traits with global gene expression. Nat Rev Genet 2009;10:184-194.

51 18. Jansen RC, Nap JP. Genetical genomics: the added value from segregation. Trends Genet 2001;17:388-391. 19. Nica AC, Dermitzakis ET. Using gene expression to investigate the genetic basis of complex disorders. Hum Mol Genet 2008;17:R129-R134. 20. Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, et al. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 2010;6:e1000888. 21. Moffatt MF, Kabesch M, Liang L, Dixon AL, Strachan D, et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 2007;448:470-473. 22. Barrett JC, Hansoul S, Nicolae DL, Cho JH, Duerr RH, et al. Genome-wide association defines more than 30 distinct susceptibility loci for Crohn’s disease. Nat Genet 2008;40:955-962. 23. Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, et al. An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 2005;37:710-717. 24. Göring HHH, Curran JE, Johnson MP, Dyer TD, Charlesworth J, et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes.Nat Genet 2007;39:1208-1216. 25. Dubois PCA, Trynka G, Franke L, Hunt KA, Romanos J, et al. Multiple common variants for celiac disease influencing immune gene expression.Nat Genet 2010;42:295-302. 26. Webster JA, Gibbs JR, Clarke J, Ray M, Zhang W, et al. Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet 2009;84:445-458. 27. Brooks BR. El Escorial World Federation of Neurology criteria for the diagnosis of amyotrophic lateral sclerosis. Subcommittee on Motor Neuron Diseases/Amyotrophic Lateral Sclerosis of the World Federation of Neurology Research Group on Neuromuscular Diseases and the El Escorial “Clinical limits of amyotrophic llateral sclerosis” workshop contributors. J Neurol Sci 1994;124 (suppl.):96-107. 28. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, et al. 
PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575. 29. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, et al. Principal components analysis corrects for stratification in genome-wide association studies.Nat Genet 2006;38:904-909. 30. Bolstad BM, Irizarry RA, Astrand M, Speed TP. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003;19:185-193. 31. Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res 2002;12:656-664. 32. Leek JT, Storey JD. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet 2007;3:1724-1735. 33. Alberts R, Terpstra P, Li Y, Breitling R, Nap J-P, et al. Sequence polymorphisms cause many false cis eQTLs. PLoS ONE 2007;2:e622. 34. Johnson AD, Handsaker RE, Pulit SL, Nizzari MM, O’Donnell CJ, et al. SNAP: a web-based tool for identifi- cation and annotation of proxy SNPs using HapMap. Bioinformatics 2008;24:2938-2939. 35. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, et al. Detection of large-scale variation in the human genome. Nat Genet 2004;36:949-951. 36. Breitling R, Li Y, Tesson BM, Fu J, Wu C, et al. Genetical genomics: spotlight on QTL hotspots. PLoS Genet 2008;4:e1000232.


37. Fransen K, Visschedijk MC, van Sommeren S, Fu JY, Franke L, et al. Analysis of SNPs with an effect on gene expression identifies UBE2L3 and BCL3 as potential new risk genes for Crohn’s disease. Hum Mol Genet 2010;19:3482-3488.
38. Dupont WD, Plummer WD. Power and sample size calculations. A review and computer program. Control Clin Trials 1990;11:116-128.
39. Cali JJ, Hsieh CL, Francke U, Russell DW. Mutations in the bile acid biosynthetic enzyme sterol 27-hydroxylase underlie cerebrotendinous xanthomatosis. J Biol Chem 1991;266:7779-7783.
40. Gallus GN, Dotti MT, Federico A. Clinical and molecular diagnosis of cerebrotendinous xanthomatosis with a review of the mutations in the CYP27A1 gene. Neurol Sci 2006;27:143-149.
41. Guyant-Maréchal L, Verrips A, Girard C, Wevers RA, Zijlstra F, et al. Unusual cerebrotendinous xanthomatosis with fronto-temporal dementia phenotype. Am J Med Genet 2005;139A:114-117.
42. Chiò A, Calvo A, Ilardi A, Cavallo E, Moglia C, et al. Lower serum lipid levels are related to respiratory impairment in patients with ALS. Neurology 2009;73:1681-1685.
43. Dupuis L, Corcia P, Fergani A, Gonzalez De Aguilar J-L, Bonnefont-Rousselot D, et al. Dyslipidemia is a protective factor in amyotrophic lateral sclerosis. Neurology 2008;70:1004-1009.
44. Dupuis L, Pradat P-F, Ludolph AC, Loeffler J-P. Energy metabolism in amyotrophic lateral sclerosis. Lancet Neurol 2011;10:75-82.
45. Stranger BE, Forrest MS, Clark AG, Minichiello MJ, Deutsch S, et al. Genome-wide associations of gene expression variation in humans. PLoS Genet 2005;1:e78.
46. Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet 2010;6:e1000952.
47. Mok K, Traynor BJ, Schymick J, Tienari PJ, Laaksovirta H, et al. The chromosome 9 ALS and FTD locus is probably derived from a single founder. Neurobiol Aging 2012;33:209.e3-e8.
48. Qurashi A, Li W, Zhou J-Y, Peng J, Jin P. Nuclear accumulation of stress response mRNAs contributes to the neurodegeneration caused by Fragile X premutation rCGG repeats. PLoS Genet 2011;7:e1002102.
49. Fu J, Keurentjes JJB, Bouwmeester H, America T, Verstappen FWA, et al. System-wide molecular evidence for phenotypic buffering in Arabidopsis. Nat Genet 2009;41:166-167.
50. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263-265.

SUPPORTING INFORMATION

TEXT S1. GWAS QUALITY CONTROL

The following quality control measures were applied to the genome-wide genotype data. Quality control was performed for the discovery and replication data separately.

1. MERGING DATASETS
Only SNPs common to all datasets were extracted. Tri-allelic SNPs and SNPs with A/T or C/G alleles were removed to prevent the occurrence of allele swaps. Subsequently, datasets were merged per country. After each merge, a flipscan was performed using PLINK software to check for possible allele swaps.1

2. REMOVAL OF DUPLICATE SAMPLES
SNPs on chromosome 22 were used for an identity-by-descent (IBD) analysis in PLINK. Pairs of individuals with a relatedness measure (pi-hat) >0.9 were considered duplicate samples. From each such pair, one individual was randomly removed from the data.

3. SNP MARKER QUALITY CONTROL
SNPs with a minor allele frequency (MAF) <5%, or with a genotyping call rate <95%, or not in Hardy-Weinberg equilibrium in controls (test p < 1×10^-4) were removed.

4. SAMPLE QUALITY CONTROL
Samples for which gender was not defined in the phenotype file, or with a genotyping call rate <95%, were removed. Additionally, inbreeding coefficients (F) were calculated in PLINK, and samples with a high (F > 0.05) or low (F < -0.025) inbreeding coefficient, indicating deviating heterozygosity rates, were excluded.

5. DIFFERENTIAL MISSINGNESS
SNPs were tested for differing missing data rates between cases and controls, and SNPs with a test p < 1×10^-3 were removed. Subsequently, a haplotype-based test for non-random missing genotype data was performed in PLINK, and SNPs with estimated haplotype frequencies >2% and a test p < 1×10^-10 were excluded.

6. GENDER CHECK
Genetic gender (based on heterozygosity rates of X chromosome SNPs) was compared to the gender reported in the phenotype file, and samples with mismatches were removed.

7. CHECK FOR RELATEDNESS BETWEEN INDIVIDUALS
For this analysis, a subset of SNPs in approximate linkage equilibrium was selected by linkage disequilibrium (LD)-based SNP pruning in PLINK. Autosomal SNPs with a genotyping call rate >0.999, MAF >5%, and a 100% call rate per sample were LD-based pruned using PLINK’s default settings. In the discovery data, this resulted in a pruned set of 31,362 SNPs, and in the replication data, this subset consisted of 34,400 markers. The pruned set of markers was used for an IBD analysis in PLINK. For pairs of individuals with a pi-hat value >0.2, the individual with the lowest genotyping rate was removed.

8. IDENTIFY POPULATION SUBSTRUCTURE
Population substructure was assessed by principal components analysis using the EIGENSTRAT program from the EIGENSOFT v3.0 software package.2 Genotypes for the previously generated subset of markers were merged with genotypes for different populations included in HapMap phase III release 2 (1,184 individuals), which were used as a reference. Plots were generated of the first two principal components, and exclusion thresholds for population outliers were defined by visual inspection. Samples identified as population outliers were removed, and principal components analysis was reapplied to the remaining individuals.

Details of quality control statistics for the GWAS cohorts are summarized in Table S5.
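As an illustration of the marker-level thresholds in step 3, the following minimal Python sketch filters SNPs on precomputed summary statistics. It is not the PLINK pipeline that was actually used; the `SnpStats` record, its field names and the example values are hypothetical, and only the three cut-offs (MAF 5%, call rate 95%, Hardy-Weinberg p 1×10^-4 in controls) come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in step 3 (SNP marker quality control).
MAF_MIN = 0.05        # remove SNPs with MAF < 5%
CALL_RATE_MIN = 0.95  # remove SNPs with call rate < 95%
HWE_P_MIN = 1e-4      # remove SNPs with HWE test p < 1e-4 in controls


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary record (not a PLINK data structure)."""
    snp_id: str
    maf: float
    call_rate: float
    hwe_p_controls: float


def passes_marker_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives all three marker filters."""
    return (
        snp.maf >= MAF_MIN
        and snp.call_rate >= CALL_RATE_MIN
        and snp.hwe_p_controls >= HWE_P_MIN
    )


def filter_markers(snps):
    """Keep only the SNPs that pass marker quality control."""
    return [s for s in snps if passes_marker_qc(s)]


# Made-up example values, for illustration only.
example = [
    SnpStats("snp_ok", maf=0.21, call_rate=0.99, hwe_p_controls=0.40),
    SnpStats("snp_rare", maf=0.01, call_rate=0.99, hwe_p_controls=0.40),
    SnpStats("snp_missing", maf=0.21, call_rate=0.90, hwe_p_controls=0.40),
    SnpStats("snp_hwe", maf=0.21, call_rate=0.99, hwe_p_controls=1e-6),
]
kept = [s.snp_id for s in filter_markers(example)]
```

In practice these filters are applied by PLINK itself during quality control; the sketch only makes the direction of each inequality explicit.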

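Table S1 Expression study populations

| Stage | Group | n | Female, n | Female, % | Age, mean (y) |
|---|---|---|---|---|---|
| Discovery | ALS cases | 162 | 61 | 38 | 63.9 |
| Discovery | Controls | 207 | 90 | 43 | 62.7 |
| Replication | ALS cases | 161 | 67 | 42 | 63.6 |
| Replication | Controls | 206 | 97 | 47 | 62.0 |
| Total | | 736 | 315 | 43 | 63.2 |

ALS, amyotrophic lateral sclerosis.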

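Table S2 GWAS populations and genotyping platforms

| Stage | Country | n ALS cases | n Controls | Platform |
|---|---|---|---|---|
| Discovery | The Netherlands | 1016 | 7069 | Illumina 317K, 370K, 550K |
| Discovery | Belgium | 300 | 328 | Illumina 370K |
| Discovery | Sweden | 458 | 455 | Illumina 370K |
| Discovery | Ireland | 220 | 209 | Illumina 550K |
| Discovery | United States | 267 | 267 | Illumina 550K |
| Discovery | Total | 2261 | 8328 | |
| Replication | France | 231 | 709 | Illumina 317K |
| Replication | United Kingdom | 239 | 212 | Illumina 317K |
| Replication | United States | 736 | 791 | Illumina 317K |
| Replication | Ireland | 101 | 123 | Illumina 610K |
| Replication | Total | 1307 | 1835 | |
| Joint GWAS | | 3568 | 10163 | |

ALS, amyotrophic lateral sclerosis; GWAS, genome-wide association study.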

Table S3 Results for replicated eQTLs associated with CYP27A1 expression levels

[Table body not recoverable from the rotated page layout.] Columns: Locus; Illumina probe identifier; SNP; Minor allele; LD with index SNP (rs4674345), r^2; GWAS discovery SNP association (OR, p); GWAS replication SNP association (OR, p); Joint GWAS SNP association (OR, p); eQTL p value after permutations (Discovery, Replication); Expression variance explained (R^2), combined data.

Locus: Chr. 2, CYP27A1. Illumina probe identifier: ILMN_1704985. SNPs listed: rs4674345, rs3770214, rs921968, rs4674338, rs1863704, rs7607369, rs2303565, rs1554622.

SNP association results in the joint data were based on a total GWAS of 3,568 ALS patients and 10,163 controls. The gene expression explained variance (R^2) was estimated from expression data from both discovery and replication datasets combined. The direction of effect for all SNP-transcript pairs was the same; for each SNP, the minor allele was associated with increased gene expression levels. CYP27A1, cytochrome P450, family 27, subfamily A, polypeptide 1; Chr., chromosome; LD, linkage disequilibrium; GWAS, genome-wide association study; OR, odds ratio; eQTL, expression quantitative trait locus; ALS, amyotrophic lateral sclerosis.

Table S4 cis eQTL overlap with previous studies

[Table body not recoverable from the rotated page layout.] Studies compared: this study (two stages), Göring et al.3, Stranger et al.4, Webster et al.5 and Gibbs et al.6

FDR, false discovery rate; LOD, logarithm of odds.

Table S5 Details of quality control of genome-wide genotype data

[Table body not recoverable from the rotated page layout.]

ALS, amyotrophic lateral sclerosis; CON, controls; QC, quality control; MAF, minor allele frequency; HWE, Hardy-Weinberg equilibrium.


Figure S1 Manhattan plot of autosomal SNP association p values in the GWAS discovery set


On the X-axis, genomic positions of SNPs aligned to NCBI genome build 36; chromosome borders are designated by changing dot colors. On the Y-axis, -log10 (p values) for association between SNP genotype and disease status as obtained from logistic regression analyses in the GWAS discovery set. The dotted line indicates the threshold for genome-wide significance (p = 5×10^-8). GWAS, genome-wide association study.

Figure S2

Quantile-quantile plot of observed -log10 (p values) versus the expectation under the null for the genome-wide association results in the GWAS discovery set

The figure shows departure from the null distribution with λGC=1.032. GWAS, genome-wide association study.
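For readers unfamiliar with the genomic inflation factor, the sketch below shows one common way to compute it from a list of association p values: convert each p value to a 1-degree-of-freedom chi-square statistic and divide the observed median by the expected median under the null (about 0.455). This is an assumed, generic definition, not necessarily the exact procedure used to obtain the value reported here; the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided p values (each in (0, 1])."""
    nd = NormalDist()
    # A two-sided p value maps to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution under the null (about 0.455).
    expected_median = nd.inv_cdf(0.25) ** 2
    return median(chi2) / expected_median


# Perfectly uniform p values give an inflation factor of 1 by construction.
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
lam = genomic_inflation(uniform_p)
```

Values slightly above 1, such as the one reported for the discovery set, indicate only mild inflation of the test statistics.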

Figure S3 Plots for SNP genotype vs. expression level correlations for eQTL SNPs modulating CYP27A1 expression levels

On the Y-axis, the residuals of log2 transformed expression levels for probe ILMN_1704985 mapping to CYP27A1 after regression of covariates in the replication data. On the X-axis, SNP genotype bins according to an additive model, with homozygotes for the major allele on the left and homozygotes for the minor allele on the right. A regression line is plotted for each linear model. P values and R^2 (variance explained) for GWAS and eQTL associations in both discovery and replication cohorts are shown below each plot. Disc, Discovery; Repl, Replication; eQTL, expression quantitative trait locus; GWAS, genome-wide association study.


Figure S4 Plots for SNP genotype vs. expression level correlations for eQTL SNPs modulating C9orf72 expression levels


On the Y-axis, the residuals of log2 transformed expression levels for probe ILMN_1741881 mapping to C9orf72 after regression of covariates in the replication data. On the X-axis, SNP genotype bins according to an additive model, with homozygotes for the major allele on the left and homozygotes for the minor allele on the right. A regression line is plotted for each linear model. P values and R^2 (variance explained) for GWAS and eQTL associations in both discovery and replication cohorts are shown below each plot. Disc, Discovery; Repl, Replication; eQTL, expression quantitative trait locus; GWAS, genome-wide association study.

SUPPLEMENTARY REFERENCES
1. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575.
2. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904-909.
3. Göring HHH, Curran JE, Johnson MP, Dyer TD, Charlesworth J, et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat Genet 2007;39:1208-1216.
4. Stranger BE, Forrest MS, Clark AG, Minichiello MJ, Deutsch S, et al. Genome-wide associations of gene expression variation in humans. PLoS Genet 2005;1:e78.
5. Webster JA, Gibbs JR, Clarke J, Ray M, Zhang W, et al. Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet 2009;84:445-458.
6. Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet 2010;6:e1000952.


PART II Genetic pleiotropy

5

NO EVIDENCE FOR SHARED GENETIC BASIS OF COMMON VARIANTS IN MULTIPLE SCLEROSIS AND AMYOTROPHIC LATERAL SCLEROSIS

HUMAN MOLECULAR GENETICS. 2014;23(7):1916-22

Frank P Diekstra*, An Goris*, Jessica van Setten*, Stephan Ripke*, Nikolaos A Patsopoulos, Stephen Sawcer, Michael van Es, The Australia and New Zealand MS Genetics Consortium, Peter Andersen, Judith Melki, Vincent Meininger, Orla Hardiman, John Landers, Robert Brown Jr, Ammar Al-Chalabi, Bryan Traynor, Adriano Chio, Roel Ophoff, Jorge Oksenberg, Philip Van Damme, Alastair Compston, Wim Robberecht, Benedicte Dubois, Leonard H van den Berg, Philip L De Jager, Jan H Veldink, Paul IW de Bakker

* These authors contributed equally to this work

ABSTRACT
Genome-wide association studies have been successful in identifying common variants that influence the susceptibility to complex diseases. From these studies, it has emerged that there is substantial overlap in susceptibility loci between diseases. In line with those findings, we hypothesized that shared genetic pathways may exist between multiple sclerosis (MS) and amyotrophic lateral sclerosis (ALS). Both diseases may have inflammatory and neurodegenerative features, and epidemiological studies have indicated an increased co-occurrence within individuals and families. To test this hypothesis, we combined genome-wide data from 4088 MS patients, 3762 ALS patients and 12 030 healthy control individuals in whom 5 440 446 single-nucleotide polymorphisms (SNPs) were successfully genotyped or imputed. We tested these SNPs for excess association shared between MS and ALS and also explored whether polygenic models of SNPs below genome-wide significance could explain some of the observed trait variance between diseases. Genome-wide association meta-analysis of SNPs as well as polygenic analyses fail to provide evidence in favor of an overlap in genetic susceptibility between MS and ALS. Hence, our findings do not support a shared genetic background of common risk variants in MS and ALS.

INTRODUCTION
Multiple sclerosis (MS, OMIM: 126200) is a common disease of the central nervous system characterized by inflammation, demyelination and axonal loss.1 Large extended families with the disease are extremely rare2, but a genetic component in susceptibility to MS has been clearly demonstrated.1 Currently known risk variants include four classical human leukocyte antigen (HLA) alleles and >50 single-nucleotide polymorphisms (SNPs) outside the HLA region.3,4 Amyotrophic lateral sclerosis (ALS, OMIM: 105400) is a neurodegenerative condition with devastating impact. Multiple cellular events contribute to the pathobiology, including mitochondrial dysfunction, excitotoxicity, protein aggregation in the cytosol, impaired axonal transport, neuroinflammation and dysregulated RNA signaling.5 About 10-20% of cases are familial, and up to 50% of these can be explained by known mutations in 18 genes including SOD1, FUS, TARDBP and C9orf72.6 The majority of patients are isolated cases, however. Not all results from genome-wide association studies (GWAS) have been replicated, but two regions of association have been confirmed in independent studies: a locus on chromosome 9 and variation in the UNC13A region.7-11 One of the lessons learned in the GWAS era is the substantial overlap in susceptibility loci between diseases. This has been demonstrated for immune-related12,13, metabolic14 and psychiatric15 disorders and indicates, sometimes unexpectedly, commonalities and differences between diseases. MS indeed shares several susceptibility loci with other immune-related disorders, including type 1 diabetes and Crohn’s disease.3 However, besides the immune component, key features of neurodegeneration, i.e. axonal transection, neuronal cell atrophy and neuronal death, are early pathological events in MS.1 Moreover, the irreversible disability seen in patients correlates more strongly with neuronal damage than with inflammatory demyelination16, although the cause of the neuronal damage remains elusive. On the other hand, for diseases classified as neurodegenerative such as ALS, an inflammatory or immune component has been implicated, but the evidence is not yet conclusive.17,18 Case reports have described patients affected by both diseases19-24 and a higher than expected co-occurrence of MS and ALS has been observed.25,26 Studies also report an increased risk of MS among relatives of patients suffering from ALS and vice versa27-29, and some but not all studies report geographical correlation in mortality rates of both diseases.30,31 In order to assess the shared genetic contribution between MS and ALS, possibly through common pathways of neurodegeneration or inflammation, we investigated the overlap of common susceptibility variants using available GWAS data.

RESULTS
We first investigated previously reported3,4,7-11 susceptibility loci in one disease for evidence of association in the other. None of the reported ALS susceptibility loci show evidence for association with MS (Table 1). Out of 56 established, independent MS susceptibility loci3,4, 4 (7.1%) show nominal significance for association with ALS, but none survived correction for multiple testing for the number of SNPs investigated (Table 2). As expected because of the overlap between the datasets used here and those used in the original studies of each disease separately, all previously reported risk factors for either MS or ALS show the same direction of effect for the respective disease in this dataset as in the original studies. Regarding the other disease, 4/5 reported ALS risk SNPs show the same direction of effect in MS as in ALS (sign test P = 0.38), and among established MS-associated SNPs, 26/56 (46%) SNPs show the same direction of effect in ALS (sign test P = 0.69). Four SNPs were previously highlighted for reaching suggestive P-values of <10^-5 for association with disease course (bout onset versus primary progressive MS).3 Only one of these shows evidence for association with ALS but in the opposite direction (data not shown).
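The sign test P-values quoted above can be reproduced with an exact two-sided binomial test under the null hypothesis that each SNP has a 50% chance of showing the same direction of effect in both diseases. The sketch below is a minimal standard-library implementation written for illustration; the function name is hypothetical and the text does not state which software was used.

```python
from math import comb


def sign_test_p(same_direction: int, n: int) -> float:
    """Exact two-sided sign test P-value under a null probability of 0.5."""
    # Take the more extreme side; the null distribution is symmetric,
    # so the two-sided P-value is twice the tail probability (capped at 1).
    k = max(same_direction, n - same_direction)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


p_als_in_ms = sign_test_p(4, 5)    # ALS risk SNPs, direction of effect in MS
p_ms_in_als = sign_test_p(26, 56)  # MS risk SNPs, direction of effect in ALS
```

With 4 of 5 and 26 of 56 concordant SNPs, this yields 0.375 and approximately 0.69, matching the rounded values reported in the text.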

Table 1 Evidence of association for reported ALS susceptibility loci with MS

| chr | rsid | position (hg19) | Gene | Risk Allele | P ALS | OR ALS | P MS | OR MS | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | rs6700125 | 59702797 | FGGY | T | 0.087 | 1.06 | 0.085 | 1.06 | 9 |
| 7 | rs10260404 | 154210798 | DPP6 | C | 0.0049 | 1.10 | 0.55 | 1.02 | 10 |
| 9 | rs3849942 | 27543281 | C9orf72 | T | 5.8E-06 | 1.19 | 0.26 | 1.04 | 7,8 |
| 12 | rs2306677 | 26636386 | ITPR2 | A | 0.080 | 1.10 | 0.60 | 1.03 | 11 |
| 19 | rs12608932 | 17752689 | UNC13A | C | 8.3E-09 | 1.21 | 0.39 | 0.97 | 7 |

Table 2 Evidence of association for independent, established MS susceptibility loci with ALS

| chr | rsid | position (hg19) | Gene | Risk Allele | P MS | OR MS | P ALS | OR ALS |
|---|---|---|---|---|---|---|---|---|
| 1 | rs4648356 | 2709164 | MMEL1 (TNFRSF14) | C | 0.012 | 1.09 | 0.97 | 1.00 |
| 1 | rs11810217 | 93148377 | EVI5 | T | 0.00032 | 1.14 | 0.12 | 0.94 |
| 1 | rs11581062 | 101407519 | SLC30A7 | G | 0.032 | 1.08 | 0.025 | 1.08 |
| 1 | rs1335532 | 117100957 | CD58 | A | 1.2E-08 | 1.35 | 0.97 | 1.00 |
| 1 | rs1323292 | 192541021 | RGS1 | A | 0.0098 | 1.11 | 0.53 | 1.03 |
| 1 | rs7522462 | 200881595 | C1orf106 | G | 0.00083 | 1.13 | 0.023 | 0.92 |
| 2 | rs6718520 a,4 | 43325570 | ZFP36L2 (THADA) | A | 1.2E-05 | 1.16 | 0.84 | 1.01 |
| 2 | rs12466022 | 43359061 | ZFP36L2 (THADA) | C | 4.2E-05 | 1.16 | 0.76 | 0.99 |
| 2 | rs7595037 | 68647095 | PLEK | T | 1.6E-05 | 1.15 | 0.32 | 0.97 |
| 2 | rs17174870 | 112665201 | MERTK | C | 0.00012 | 1.15 | 0.79 | 1.01 |
| 2 | rs10201872 | 231106724 | SP140 | T | 0.00056 | 1.15 | 0.13 | 1.07 |
| 3 | rs669607 | 28071444 | intergenic | C | 2.5E-05 | 1.15 | 0.57 | 0.98 |
| 3 | rs2028597 | 105558837 | CBLB | G | 0.56 | 1.03 | 0.52 | 1.04 |
| 3 | rs2293370 | 119219934 | C3orf1 | G | 0.056 | 1.08 | 0.29 | 0.96 |
| 3 | rs9282641 | 121796768 | CD86 | G | 0.0015 | 1.22 | 0.52 | 0.96 |
| 3 | rs2243123 | 159709651 | IL12A | C | 0.17 | 1.05 | 0.25 | 1.04 |
| 4 | rs228614 | 103578637 | MANBA | G | 0.0092 | 1.18 | 0.23 | 0.625 |
| 5 | rs6897932 | 35874575 | IL7R | C | 0.0014 | 1.12 | 0.20 | 0.96 |
| 5 | rs4613763 | 40392728 | PTGER4 | C | 0.00014 | 1.19 | 0.87 | 0.99 |
| 5 | rs2546890 | 158759900 | IL12B | A | 3.8E-06 | 1.16 | 0.78 | 1.01 |
| 6 | rs12212193 | 90996769 | BACH2 | G | 0.0055 | 1.09 | 0.14 | 1.05 |
| 6 | rs802734 | 128278798 | PTPRK | A | 0.0014 | 1.12 | 0.89 | 1.00 |
| 6 | rs11154801 | 135739355 | AHI1 | A | 0.014 | 1.08 | 0.49 | 0.98 |
| 6 | rs17066096 | 137452908 | IL22RA2 | G | 0.00096 | 1.13 | 0.29 | 0.96 |
| 6 | rs1738074 | 159465977 | TAGAP | C | 0.00075 | 1.12 | 0.45 | 0.98 |
| 7 | rs354033 | 149289464 | ZNF767 | G | 0.00079 | 1.13 | 0.26 | 1.04 |
| 8 | rs1520333 | 79401038 | PKIA | G | 0.11 | 1.06 | 0.41 | 1.03 |
| 8 | rs4410871 | 128815029 | MYC | C | 0.018 | 1.09 | 0.54 | 1.02 |
| 9 | rs2150702 | 5893861 | MLANA | G | 2.5E-05 | 1.14 | 0.015 | 1.08 |
| 10 | rs3118470 | 6101713 | IL2RA | C | 0.00078 | 1.12 | 0.76 | 1.01 |
| 10 | rs1250550 | 81060317 | ZMIZ1 | A | 0.0024 | 1.11 | 0.66 | 0.98 |
| 10 | rs7923837 | 94481917 | HHEX | G | 0.015 | 1.08 | 0.18 | 0.96 |
| 11 | rs650258 | 60832282 | CD5 | C | 0.00018 | 1.14 | 0.097 | 0.95 |
| 11 | rs630923 | 118754353 | CXCR5 | C | 0.033 | 1.11 | 0.066 | 1.08 |
| 12 | rs1800693 | 6440009 | TNFRSF1A | G | NA b | NA | 0.67 | 1.01 |
| 12 | rs10466829 | 9876091 | CLECL1 | A | 0.0009 | 1.11 | 0.49 | 0.98 |
| 12 | rs12368653 | 58133256 | AGAP2 | A | 0.0018 | 1.10 | 0.31 | 0.97 |
| 12 | rs949143 | 123595163 | ARL6IP4 | G | 0.015 | 1.08 | 0.57 | 0.98 |
| 14 | rs4902647 | 69254191 | ZFP36L1 | C | 0.00022 | 1.12 | 0.72 | 0.99 |
| 14 | rs2300603 | 76005557 | BATF | T | 0.014 | 1.10 | 0.10 | 0.94 |
| 14 | rs2119704 | 88487689 | GPR65 | C | 0.045 | 1.13 | 0.23 | 0.93 |
| 16 | rs2744148 | 1073552 | SOX8 | G | 0.023 | 1.10 | 0.30 | 0.95 |
| 16 | rs7200786 | 11177801 | CLEC16A | A | 8.8E-05 | 1.14 | 0.58 | 0.98 |
| 16 | rs13333054 | 86011033 | IRF8 | T | 0.063 | 1.09 | 0.98 | 1.00 |
| 17 | rs9891119 | 40507980 | STAT3 | C | 0.00016 | 1.13 | 0.86 | 0.99 |
| 17 | rs180515 | 58024275 | RPS6KB1 | G | 0.093 | 1.06 | 0.74 | 1.01 |
| 18 | rs7238078 | 56384192 | MALT1 | T | 0.00075 | 1.13 | 0.99 | 1.00 |
| 19 | rs1077667 | 6668972 | TNFSF14 | C | 0.033 | 1.10 | 0.10 | 0.94 |
| 19 | rs8112449 | 10520064 | CDC37 | G | 0.14 | 1.05 | 0.83 | 0.99 |
| 19 | rs874628 | 18304700 | MPV17L2 | A | 0.021 | 1.09 | 0.65 | 0.98 |
| 19 | rs2303759 | 49869051 | DKKL1 | G | 0.0075 | 1.11 | 0.034 | 1.08 |
| 20 | rs2425752 | 44702120 | NCOA5 | T | 0.0001 | 1.14 | 0.40 | 0.97 |
| 20 | rs2248359 | 52791518 | CYP24A1 | C | 0.00085 | 1.12 | 0.29 | 1.04 |
| 20 | rs6062314 | 62409713 | ZBTB46 | T | 0.047 | 1.14 | 0.52 | 1.04 |
| 22 | rs2283792 | 22131125 | MAPK1 | G | 0.00036 | 1.12 | 0.23 | 1.04 |
| 22 | rs140522 | 50971266 | ODF3B | T | 0.0022 | 1.12 | 0.72 | 0.99 |

Source of variants: 3, except where specified: 4
a r^2 = 0.15 with adjacent variant rs12466022
b No SNP with r^2 > 0.6


We next combined summary results from both MS and ALS datasets in a meta-analysis, looking for modest effects in each dataset that strengthen each other in the combined analysis. The combined analysis of both diseases included a total of 5 440 446 SNPs (Fig. 1). The genomic inflation factor (λ) was 1.033 for MS, 0.997 for ALS and 1.005 for the combined MS-ALS meta-analysis. In the meta-analysis, the HLA region reaches genome-wide significance, but this is driven by the MS component (P ALS with same direction of effect ≥0.01). One region, near NPEPPS on chromosome 17 (rs2935183), reaches suggestive association levels of P < 5 × 10^-7 but is once again driven by MS [P (MS) = 6.5 × 10^-7; P (ALS) = 0.41]. Lastly, we investigated the possibility of an overlap of small susceptibility effects (polygenic score or ‘en masse’ effect). To this end, we collectively tested SNPs that reached certain thresholds in the MS or ALS GWASs for association with ALS and MS, respectively. After correction for multiple testing, none of the models were significantly associated with disease (Tables 3 and 4), with the best model for each disease explaining only 0.05% of the phenotypic variance. To test whether the lack of association may have been affected by association results in the HLA region (which is known to be strongly associated with MS, but not with ALS), we repeated the polygenic analysis excluding SNPs in the HLA region (removing all SNPs on chromosome 6 between 29 and 33 Mb). This did not influence the results (Supplementary Material, Table S1).
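The Materials and Methods state that the two datasets were combined with an inverse variance fixed-effects model as implemented in PLINK. As a rough illustration of that calculation for a single SNP, the Python sketch below pools per-study log odds ratios weighted by the inverse of their squared standard errors. It is not the PLINK implementation; the function name and the example effect sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effects_meta(estimates):
    """Inverse-variance fixed-effects pooling for one SNP.

    estimates: list of (log_odds_ratio, standard_error) pairs, one per study,
    all expressed relative to the same reference allele.
    Returns (pooled_log_or, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    z = pooled / pooled_se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return pooled, pooled_se, p


# Made-up numbers: concordant effects reinforce each other in the pooled test ...
concordant = fixed_effects_meta([(0.10, 0.05), (0.10, 0.05)])
# ... whereas opposite effects of equal precision cancel out.
discordant = fixed_effects_meta([(0.10, 0.05), (-0.10, 0.05)])
```

This mirrors the behaviour described above: a signal present in only one disease (as for the HLA and NPEPPS regions in MS) is diluted rather than strengthened by adding the other dataset.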

Figure 1 Manhattan plots

Manhattan plots of (A) combined MS-ALS analysis and (B) an overlay of the individual components consisting of both diseases (blue: MS, red: ALS). The y-axis has been cut off at -logP = 10. Red and blue horizontal lines indicate genome-wide (P < 5 × 10^-8) and suggestive (P < 5 × 10^-7) evidence.

Table 3 Polygenic score based on MS data in ALS

| Model | P-value | Number of SNPs | Nagelkerke r^2 corrected for baseline a |
|---|---|---|---|
| <5E-8 | 0.820 | 75 | 5.4E-06 |
| <5E-7 | 0.963 | 90 | 2.0E-07 |
| <5E-6 | 0.987 | 114 | 0.0E+00 |
| <5E-5 | 0.827 | 184 | 5.0E-06 |
| <5E-4 | 0.880 | 633 | 2.4E-06 |
| <5E-3 | 0.414 | 3454 | 6.9E-05 |
| <0.05 | 0.775 | 22284 | 8.5E-06 |
| <0.1 | 0.848 | 38861 | 3.8E-06 |
| <0.2 | 0.986 | 66276 | 1.0E-07 |
| <0.3 | 0.743 | 89109 | 1.1E-05 |
| <0.4 | 0.459 | 108626 | 5.7E-05 |
| <0.5 | 0.412 | 125558 | 7.0E-05 |

a Baseline: PC1-3, dummy coded cohorts

Table 4 Polygenic score based on ALS in MS

| Model | P-value | Number of SNPs | Nagelkerke r^2 corrected for baseline a |
|---|---|---|---|
| <5E-8 | 0.843 | 3 | 4.5E-06 |
| <5E-7 | 0.785 | 4 | 8.4E-06 |
| <5E-6 | 0.500 | 7 | 5.2E-05 |
| <5E-5 | 0.452 | 49 | 6.4E-05 |
| <5E-4 | 0.928 | 389 | 9.3E-07 |
| <5E-3 | 0.306 | 3075 | 1.2E-04 |
| <0.05 | 0.032 | 22315 | 5.2E-04 |
| <0.1 | 0.050 | 38922 | 4.4E-04 |
| <0.2 | 0.040 | 66738 | 4.8E-04 |
| <0.3 | 0.057 | 89592 | 4.1E-04 |
| <0.4 | 0.048 | 108839 | 4.4E-04 |
| <0.5 | 0.074 | 125337 | 3.6E-04 |

a Baseline: PC1-5, dummy coded cohorts
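To make the threshold models in Tables 3 and 4 concrete, the sketch below selects SNPs below each GWAS P-value threshold and computes a per-individual score. It assumes the common definition of a polygenic score as the sum of risk-allele dosages weighted by the log odds ratio from the other disease's GWAS; this excerpt does not spell out the weighting actually used, and the SNP identifiers, field names and values are hypothetical. The input is assumed to be already pruned for linkage disequilibrium.

```python
from math import log

# The twelve P-value thresholds that define the models in Tables 3 and 4.
P_THRESHOLDS = [5e-8, 5e-7, 5e-6, 5e-5, 5e-4, 5e-3, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]


def select_model(gwas_snps, threshold):
    """SNPs from the (LD-pruned) training GWAS with P below the threshold."""
    return [snp for snp in gwas_snps if snp["p"] < threshold]


def polygenic_score(dosages, model):
    """Sum of risk-allele dosages weighted by log(OR) (assumed weighting).

    dosages: dict mapping SNP id to risk-allele dosage (0 to 2) in one person;
    SNPs missing for that person contribute nothing.
    """
    return sum(dosages.get(snp["id"], 0.0) * log(snp["or"]) for snp in model)


# Hypothetical training summary statistics and one hypothetical individual.
training = [
    {"id": "snpA", "p": 1e-9, "or": 1.20},
    {"id": "snpB", "p": 0.03, "or": 0.90},
    {"id": "snpC", "p": 0.45, "or": 1.05},
]
person = {"snpA": 2.0, "snpB": 1.0}

model_sizes = [len(select_model(training, t)) for t in P_THRESHOLDS]
score_005 = polygenic_score(person, select_model(training, 0.05))
```

Each model's scores would then be tested for association with the other disease, adjusting for the baseline covariates listed under each table.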

DISCUSSION
In this study, we have applied several statistical approaches to the investigation of shared susceptibility loci between the neurological diseases MS and ALS, which are both thought to involve inflammatory and neurodegenerative components1,17,18 and for which case reports and epidemiological studies have reported co-occurrence within individuals or families.19-29 The strength of the study is that different statistical approaches are consistent in demonstrating that the number of regions in the genome with evidence for an overlap in susceptibility between the two diseases is not more than expected by chance. Among 65 loci previously implicated in one disease or disease subgroup, only 5 show nominally significant association with the other disease and none survive correction for multiple testing. There was no significant enrichment for the same direction of effect in both diseases. In a combined analysis of both diseases, no region outside of the HLA reaches genome-wide significance and only one reaches suggestive association levels of P < 5 × 10^-7. Moreover, for both of these regions, the results appear to be driven by strong evidence in MS, despite sample sizes of similar magnitude for the two diseases. Furthermore, the polygenic analysis demonstrates that it is unlikely that many common variants with effect sizes too small to be detected individually are shared between the two diseases.
This contrasts with other diseases where a polygenic risk score calculated for one disease is associated with related diseases, as in the example of schizophrenia and bipolar disorder.15 MS is a clinically heterogeneous disease, with the majority of patients (~80%) suffering from a bout onset form of the disease with relapses and remissions, possibly followed by secondary progression, and the remaining 20% being characterized by progression from onset.1 It has been speculated that both forms represent a continuous spectrum of disease phenotypes with risk factors driving the balance between inflammation and neurodegeneration.32 Genetic association studies have so far not provided evidence for a different pathogenesis of the two forms.3 On the contrary, HLA-DRB1*1501, the strongest risk factor in MS and especially immunological in nature, is shared between both bout onset and primary progressive MS. In this study, there was no evidence for shared loci with the same direction of effect between ALS and primary progressive MS.
A total of >50 common risk variants for MS have now been identified.3,4 There is a highly significant enrichment for immune system genes in this list, with only few variants having a potential neurological function.3 GWAS in ALS have seen limited success.8 This discrepancy in the number of common risk variants identified between immunological and other diseases has been suggested to reflect a history of selection and adaptation of variants influencing the immune system.33,34 Mutations in several genes cause familial forms of ALS, and it has been thought that less common (1-5%) or rare (<1%) variants play a role in sporadic forms of the disease as well.35 Similarly, first reports of less common and rare variants in MS are emerging.36,37 This category of variants, which are not well captured by current genome-wide association studies, may explain part of the heritability in MS and ALS that remains unaccounted for by common variants (‘missing heritability’), and potentially the shared neurodegenerative component. Next-generation sequencing offers a technology suited to address this hypothesis. It has recently been demonstrated that a large proportion of ALS is related to a GGGGCC hexanucleotide repeat expansion in intron 1 of C9orf7238,39, located in a region on chromosome 9p previously highlighted in GWAS of ALS.7,8 We did not observe any association of the C9orf72 region with MS. This is in line with the fact that no repeat expansions were observed in a cohort of 215 MS patients.25 Hence, C9orf72 variation does not appear to be a risk factor for MS. It has been suggested that MS can act as a modifier that increases the likelihood of C9orf72 expansions becoming penetrant and causing concurrent ALS25, although further investigation is required.40 In summary, the overlap of common variants between MS and other autoimmune disorders is not matched by a similar overlap between MS and other neurological disorders, such as ALS in this study. Whether less common or rare variants explain some of the shared neurodegenerative or neuroinflammatory aspects of both diseases cannot be addressed with the currently available datasets and remains to be examined with emerging technologies.

MATERIALS AND METHODS
We used data from 6 datasets totaling 4088 MS patients and 7144 controls from a recent meta-analysis of MS genome-wide association studies.4 Imputation was performed using Beagle v3.1 and the 1000 Genomes Project (1000G) Phase I (a) reference panel (2010/11 data freeze, 2011/6 haplotypes), and analysis was performed as described previously using the postimputation probabilities and the first five principal components (PC) as covariates,4 leading to association results for a total of 6 948 682 SNPs with INFO of >0.10 and a minor allele frequency of >0.01 in all 6 datasets.
The ALS study population consists of 3 762 patients and 4 886 controls over 11 cohorts, for which details have been described previously.7,41 Imputation was performed using Beagle v.3.1.1. software with the 1000G CEU Aug 2010 reference panel. Analysis on dosage data including 3 PC led to association results for 12 249 385 SNPs.
A/T and C/G SNPs were removed, and results from both datasets on 5 440 446 overlapping SNPs were combined using an inverse variance fixed-effects model as implemented in the PLINK software package.42 Power was >99% for OR of ≥1.2 and >80% for OR of ≥1.15 at a typical risk allele frequency of 30% and genome-wide significance (P < 5 × 10⁻⁸).
Polygenic risk scores were calculated per individual to test the collective impact of SNPs that are associated with ALS on MS and vice versa. For each trait (MS and ALS), we first pruned the association results of the GWAS by linkage disequilibrium (r² = 0.1), preferentially keeping SNPs with lower P-values. We selected twelve sets of SNPs (models) based on their GWAS P-values (<5 × 10⁻⁸, <5 × 10⁻⁷, <5 × 10⁻⁶, <5 × 10⁻⁵, <5 × 10⁻⁴, <5 × 10⁻³, <0.05, <0.1, <0.2, <0.3, <0.4 and <0.5). The smallest model contains three SNPs, whereas the models of P < 0.5 contain >125 000 SNPs (Table 3).
Next, we calculated a polygenic risk score in all individuals of the other GWAS by summing up the dosages of the risk alleles in each model, multiplied by the log-odds. We then tested the association between the risk score and the phenotype using logistic regression with the same number of PCs as used in the original analysis of each trait (ALS: PC1-3, MS: PC1-5) and dummy-coded cohorts as covariates. Nagelkerke r² was calculated to test the variance explained by each model.43
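The scoring step described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the SNP identifiers, odds ratios, P-values and dosages are hypothetical, the input is assumed to be LD-pruned already, and the helper names (`select_model`, `polygenic_score`, `nagelkerke_r2`) are introduced here for illustration only. The Nagelkerke function takes the log-likelihoods of whichever two logistic models are being compared; fitting those models is outside the sketch.

```python
import math


def select_model(snps, p_threshold):
    """Keep (already LD-pruned) SNPs whose discovery P-value is below the threshold."""
    return [s for s in snps if s["p"] < p_threshold]


def polygenic_score(dosages, log_odds):
    """Sum of risk-allele dosages, each multiplied by the SNP's log odds ratio."""
    if len(dosages) != len(log_odds):
        raise ValueError("one dosage per SNP is required")
    return sum(d * w for d, w in zip(dosages, log_odds))


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke r2 from the log-likelihoods of a null and a fitted model."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    maximum = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / maximum


# Hypothetical discovery results; odds ratios refer to the risk allele.
discovery = [
    {"id": "snp1", "odds_ratio": 1.20, "p": 3e-9},
    {"id": "snp2", "odds_ratio": 1.10, "p": 2e-4},
    {"id": "snp3", "odds_ratio": 1.05, "p": 0.30},
]
# Hypothetical risk-allele dosages (0 to 2) of one individual in the other GWAS.
dosage = {"snp1": 2.0, "snp2": 0.5, "snp3": 1.0}

model = select_model(discovery, 5e-4)  # the "<5E-4" model
score = polygenic_score(
    [dosage[s["id"]] for s in model],
    [math.log(s["odds_ratio"]) for s in model],
)
```

Repeating the selection for each of the twelve thresholds gives one score per individual per model, which would then enter the logistic regression together with the principal components and cohort indicators.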


REFERENCES
1. Compston A, Coles A. Multiple sclerosis. Lancet 2008;372:1502-1517.
2. Willer CJ, Dyment DA, Cherny S, Ramagopalan SV, Herrera BM, Morrison KM, Sadovnick AD, Risch NJ, Ebers GC. A genome-wide scan in forty large pedigrees with multiple sclerosis. J Hum Genet 2007;52:955-962.
3. The International Multiple Sclerosis Genetics Consortium, The Wellcome Trust Case Control Consortium 2. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 2011;476:214-219.
4. Patsopoulos NA, Bayer Pharma MS Genetics Working Group, Steering Committees of Studies Evaluating IFNb-1b and a CCR1-Antagonist, ANZgene Consortium, GeneMSA, The International Multiple Sclerosis Genetics Consortium, De Bakker PIW. Genome-wide meta-analysis identifies novel multiple sclerosis susceptibility loci. Ann Neurol 2011;70:897-912.
5. Ferraiuolo L, Kirby J, Grierson AJ, Sendtner M, Shaw PJ. Molecular pathways of motor neuron injury in amyotrophic lateral sclerosis. Nat Rev Neurol 2011;7:616-630.
6. Andersen PM, Al-Chalabi A. Clinical genetics of amyotrophic lateral sclerosis: what do we really know? Nat Rev Neurol 2011;7:603-615.
7. van Es MA, Veldink JH, Saris CG, Blauw HM, van Vught PW, Birve A, Lemmens R, Schelhaas HJ, Groen EJ, Huisman MH, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
8. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, Vance C, Johnson L, Veldink JH, van Es MA, van den Berg LH, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
9. Dunckley T, Huentelman MJ, Craig DW, Pearson JV, Szelinger S, Joshipura K, Halperin RF, Stamper C, Jensen KR, Letizia D, et al. Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med 2007;357:775-788.
10. van Es MA, van Vught PW, Blauw HM, Franke L, Saris CG, Van den Bosch L, de Jong SW, de Jong V, Baas F, van't Slot R, et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet 2008;40:29-31.
11. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, Andersen PM, Van Den Bosch L, de Jong SW, van't Slot R, Birve A, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877.
12. Zhernakova A, van Diemen CC, Wijmenga C. Detecting shared pathogenesis from the shared genetics of immune-related diseases. Nat Rev Genet 2009;10:43-55.
13. Cotsapas C, Voight BF, Rossin E, Lage K, Neale BM, Wallace C, Abecasis GR, Barrett JC, Behrens T, Cho J, et al. Pervasive sharing of genetic effects in autoimmune disease. PLoS Genet 2011;7:e1002254.
14. Mohlke KL, Boehnke M, Abecasis GR. Metabolic and cardiovascular traits: an abundance of recently identified common genetic variants. Hum Mol Genet 2008;17:R102-R108.
15. The International Schizophrenia Consortium. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009;460:748-752.
16. Herz J, Zipp F, Siffrin V. Neurodegeneration in autoimmune CNS inflammation. Exp Neurol 2010;225:9-17.
17. Philips T, Robberecht W. Neuroinflammation in amyotrophic lateral sclerosis: role of glial activation in motor neuron disease. Lancet Neurol 2011;10:253-263.
18. Saresella M, Piancone F, Tortorella P, Marventano I, Gatti A, Caputo D, Lunetta C, Corbo M, Rovaris M, Clerici M. T helper-17 activation dominates the immunologic milieu of both amyotrophic lateral sclerosis and progressive multiple sclerosis. Clin Immunol 2013;148:79-88.
19. Borisow N, Meyer T, Paul F. Concomitant amyotrophic lateral sclerosis and paraclinical laboratory features of multiple sclerosis: coincidence or causal relationship? BMJ Case Rep 2013;doi: 10.1136/bcr-2012-007975.
20. Trojsi F, Sagnelli A, Cirillo G, Piccirillo G, Femiano C, Izzo F, Monsurro MR, Tedeschi G. Amyotrophic lateral sclerosis and multiple sclerosis overlap: a case report. Case Rep Med 2012;2012:324685.
21. Li G, Esiri MM, Ansorge O, DeLuca GC. Concurrent multiple sclerosis and amyotrophic lateral sclerosis: where inflammation and neurodegeneration meet? J Neuroinflammation 2012;9:20.
22. Allen JA, Stein R, Baker RA, Royden Jones H. Muscle atrophy associated with multiple sclerosis: a benign condition or the onset of amyotrophic lateral sclerosis? J Clin Neurosci 2008;15:706-708.
23. Dynes GJ, Schwimer CJ, Staugaitis SM, Doyle JJ, Hays AP, Mitsumoto H. Amyotrophic lateral sclerosis with multiple sclerosis: a clinical and pathological report. Amyotroph Lateral Scler Other Motor Neuron Disord 2000;1:349-353.
24. Hader WJ, Rozdilsky B, Nair CP. The concurrence of multiple sclerosis and amyotrophic lateral sclerosis. Can J Neurol Sci 1986;13:66-69.
25. Ismail A, Cooper-Knock J, Highley JR, Milano A, Kirby J, Goodall E, Lowe J, Scott I, Constantinescu CS, Walters SJ, et al. Concurrence of multiple sclerosis and amyotrophic lateral sclerosis in patients with hexanucleotide repeat expansions of C9ORF72. J Neurol Neurosurg Psychiatry 2012;84:79-87.
26. Turner MR, Goldacre R, Ramagopalan S, Talbot K, Goldacre MJ. Autoimmune disease preceding amyotrophic lateral sclerosis: an epidemiologic study. Neurology 2013;81:1222-1225.
27. Hemminki K, Li X, Sundquist J, Hillert J, Sundquist K. Risk for multiple sclerosis in relatives and spouses of patients diagnosed with autoimmune and related conditions. Neurogenetics 2009;10:5-11.
28. Etemadifar M, Abtahi SH, Akbari M, Maghzi AH. Multiple sclerosis and amyotrophic lateral sclerosis: is there a link? Mult Scler 2012;18:902-904.
29. Hemminki K, Li X, Sundquist J, Sundquist K. Familial risks for amyotrophic lateral sclerosis and autoimmune diseases. Neurogenetics 2009;10:111-116.
30. Landtblom AM, Riise T, Boiko A, Söderfeldt B. Distribution of multiple sclerosis in Sweden based on mortality and disability compensation statistics. Neuroepidemiology 2002;21:167-179.
31. Bostrom I, Riise T, Landtblom AM. Mortality statistics for multiple sclerosis and amyotrophic lateral sclerosis in Sweden. Neuroepidemiology 2012;38:245-249.
32. Hensiek AE, Seaman SR, Barcellos LF, Oturai A, Eraksoi M, Cocco E, Vecsei L, Stewart G, Dubois B, Bellman-Strobl J, et al. Familial effects on the clinical course of multiple sclerosis. Neurology 2007;68:376-383.
33. Corona E, Dudley JT, Butte AJ. Extreme evolutionary disparities seen in positive selection across seven complex diseases. PLoS One 2010;5:e12236.
34. Casto AM, Feldman MW. Genome-wide association study SNPs in the human genome diversity project populations: does selection affect unlinked SNPs with shared trait associations? PLoS Genet 2011;7:e1001266.
35. Dion PA, Daoud H, Rouleau GA. Genetics of motor neuron disorders: new insights into pathogenic mechanisms. Nat Rev Genet 2009;10:769-782.
36. De Jager PL, Jia X, Wang J, de Bakker PI, Ottoboni L, Aggarwal NT, Piccio L, Raychaudhuri S, Tran D, Aubin C, et al. Meta-analysis of genome scans and replication identify CD6, IRF8 and TNFRSF1A as new multiple sclerosis susceptibility loci. Nat Genet 2009;41:776-782.
37. Goris A, Fockaert N, Cosemans L, Clysters K, Nagels G, Boonen S, Thijs V, Robberecht W, Dubois B. TNFRSF1A coding variants in multiple sclerosis. J Neuroimmunol 2011;235:110-112.
38. DeJesus-Hernandez M, Mackenzie IR, Boeve BF, Boxer AL, Baker M, Rutherford NJ, Nicholson AM, Finch NA, Flynn H, Adamson J, et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 2011;72:245-256.
39. Renton AE, Majounie E, Waite A, Simon-Sanchez J, Rollinson S, Gibbs JR, Schymick JC, Laaksovirta H, van Swieten JC, Myllykangas L, et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011;72:257-268.
40. Van Doormaal PTC, Gallo A, van Rheenen W, Veldink JH, van Es MA, Van den Berg LH. Amyotrophic lateral sclerosis is not linked to multiple sclerosis in a population based study. J Neurol Neurosurg Psychiatry 2013;84:940-941.
41. Chio A, Schymick JC, Restagno G, Scholz SW, Lombardo F, Lai SL, Mora G, Fung HC, Britton A, Arepalli S, et al. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2009;18:1524-1532.
42. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575.
43. Nagelkerke NJD. A note on a general definition of the coefficient of determination. Biometrika 1991;78:691-692.

SUPPLEMENTARY INFORMATION

Supplementary Table 1 Polygenic risk model excluding the HLA region

A. Polygenic score based on MS data in ALS

Model     P-value   # SNPs    Nagelkerke r² corrected for baselineᵃ
<5E-8     0.341     4         9.4E-05
<5E-7     0.446     8         6.0E-05
<5E-6     0.481     23        5.2E-05
<5E-5     0.512     87        4.5E-05
<5E-4     0.727     518       1.3E-05
<5E-3     0.405     3314      7.2E-05
<0.05     0.623     22116     2.5E-05
<0.1      0.732     29682     1.2E-05
<0.2      0.928     66084     9.0E-07
<0.3      0.652     88914     2.1E-05
<0.4      0.380     108424    8.0E-05
<0.5      0.338     125353    9.5E-05

ᵃ Baseline: PC1-3, dummy coded cohorts.

B. Polygenic score based on ALS data in MS

Model     P-value   # SNPs    Nagelkerke r² corrected for baselineᵇ
<5E-8     0.843     3         4.5E-06
<5E-7     0.785     4         8.4E-06
<5E-6     0.500     7         5.2E-05
<5E-5     0.452     49        6.4E-05
<5E-4     0.882     388       2.5E-06
<5E-3     0.597     3066      3.2E-05
<0.05     0.029     22274     5.4E-04
<0.1      0.050     38850     4.4E-04
<0.2      0.038     66630     4.9E-04
<0.3      0.052     89464     4.3E-04
<0.4      0.041     108697    4.7E-04
<0.5      0.061     125179    4.0E-04

ᵇ Baseline: PC1-5, dummy coded cohorts.

INTERNATIONAL MULTIPLE SCLEROSIS GENETIC CONSORTIUM (IMSGC) MEMBERSHIP
Lisa Barcellos, David Booth, Jacob L McCauley, Manuel Comabella, Alastair Compston, Sandra D'Alfonso, Philip De Jager, Bertrand Fontaine, An Goris, David Hafler, Jonathan Haines, Hanne F. Harbo, Stephen L. Hauser, Clive Hawkins, Bernhard Hemmer, Jan Hillert, Adrian Ivinson, Ingrid Kockum, Roland Martin, Filippo Martinelli Boneschi, Jorge Oksenberg, Tomas Olsson, Annette Oturai, Nikolaos Patsopoulos, Margaret Pericak-Vance, Janna Saarela, Stephen Sawcer, Anne Spurkland, Graeme Stewart, Frauke Zipp


THE AUSTRALIA AND NEW ZEALAND MS GENETICS CONSORTIUM (ANZGENE) MEMBERSHIP
Rodney J Scott, Jeannette Lechner-Scott, Pablo Moscato, David R Booth, Graeme J Stewart, Robert N Heard, Deborah Mason, Lyn Griffiths, Simon Broadley, Matthew A Brown, Mark Slee, Simon J Foote, Jim Stankovich, Bruce V Taylor, James Wiley, Melanie Bahlo, Victoria Perreau, Judith Field, Helmut Butzkueven, Trevor J Kilpatrick, Justin Rubio, Mark Marriott, William M Carroll, Allan G Kermode


6

C9ORF72 AND UNC13A ARE SHARED RISK LOCI FOR ALS AND FTD: A GENOME-WIDE META-ANALYSIS

ANNALS OF NEUROLOGY. 2014;76:120-33

Frank P Diekstra, Vivianna M Van Deerlin,† John C van Swieten,† Ammar Al-Chalabi, Albert C Ludolph, Jochen H Weishaupt, Orla Hardiman, John E Landers, Robert H Brown Jr, Michael A van Es, R Jeroen Pasterkamp, Max Koppers, Peter M Andersen, Karol Estrada, Fernando Rivadeneira, Albert Hofman, Andre G Uitterlinden, Philip van Damme, Judith Melki, Vincent Meininger, Aleksey Shatunov, Christopher E Shaw, P Nigel Leigh, Pamela J Shaw, Karen E Morrison, Isabella Fogh, Adriano Chio, Bryan J Traynor, David Czell, Markus Weber, Peter Heutink,‡ Paul I W de Bakker, Vincenzo Silani,‡ Wim Robberecht, Leonard H van den Berg,* Jan H Veldink,*

* These authors were joint senior authors on this work
† on behalf of the International Collaboration for Frontotemporal Lobar Degeneration; see Supplementary Table 5 for full list of contributors
‡ on behalf of the SLAGEN Consortium; see Supplementary Table 5 for full list of contributors

ABSTRACT
OBJECTIVE: Substantial clinical, pathological and genetic overlap exists between amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). TDP-43 inclusions have been found in both ALS and FTD cases (FTD-TDP). Recently, a repeat expansion in C9orf72 was identified as the causal variant in a proportion of ALS and FTD cases. We sought to identify additional evidence for a common genetic basis for the spectrum of ALS-FTD.

METHODS: We used published GWAS data of 4,377 ALS patients and 13,017 controls and 435 pathology-proven FTD-TDP cases and 1,414 controls for genotype imputation. Data were analyzed in a joint meta-analysis, by replicating topmost associated hits of one disease in the other, and by using a conservative rank products analysis, allocating equal weight to ALS and FTD-TDP sample sizes.

RESULTS: Meta-analysis identified 19 genome-wide significant single nucleotide polymorphisms (SNPs) at C9orf72 on chromosome 9p21.2 (lowest p=2.6×10⁻¹²) and one SNP in UNC13A on chromosome 19p13.11 (p=1.0×10⁻¹¹) as shared susceptibility loci for ALS and FTD-TDP. Conditioning on the 9p21.2 genotype increased statistical significance at UNC13A. A third signal, on chromosome 8q24.13 at the SPG8 locus coding for strumpellin (p=3.91×10⁻⁷), was replicated in an independent cohort of 4,056 ALS patients and 3,958 controls (p=0.026; combined analysis p=1.01×10⁻⁷).

INTERPRETATION: We identified common genetic variants at C9orf72 and, in addition, in UNC13A that are shared between ALS and FTD. UNC13A provides a novel link between ALS and FTD-TDP, and identifies changes in neurotransmitter release and synaptic function as a converging mechanism in the pathogenesis of ALS and FTD-TDP.

INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterized by progressive muscle weakness due to the loss of motor neurons in both brain and spinal cord. No cure exists and disease etiology has not yet been fully elucidated. Important overlap exists with frontotemporal dementia (FTD), which is characterized by changes in cognition, behavior and language. Clinically, approximately 5-15% of patients with ALS have FTD, while about 3-14% of FTD patients also fulfill the criteria for ALS.1,2 Neuropathologically, the majority of FTD cases can be divided into two subtypes, characterized by cellular inclusions of either tau (FTD-tau) or TDP-43 (FTD-TDP). TDP-43 inclusions have been found in neurons of both ALS and FTD-TDP patients.3 Lastly, substantial genetic overlap between ALS and FTD has been reported. Linkage studies identified a locus of several megabases on chromosome 9p21 in families of patients with both ALS and FTD.4-6 Previous genome-wide association studies (GWAS) of non-familial ALS helped fine-map this region, and recently, a hexanucleotide repeat expansion in C9orf72 was discovered in this region.7-11 The C9orf72 repeat expansion is present in approximately 6% of sporadic ALS and sporadic FTD patients, and in up to 37% and 25% of familial ALS and FTD cases, respectively.12 Additionally, mutations in VCP have been implicated in both ALS and FTD.13 Furthermore, mutations in the gene for TDP-43 (TARDBP) have been found in ALS and FTD with motor neuron degeneration (FTD-MND), but are rarely present in FTD-TDP cases without motor neuron symptoms.14,15 The majority of gene mutations have been found in familial cases of ALS and FTD, but these mutations are less frequent in cases without a positive family history.11,12,16
Meta-analysis of GWAS data is a powerful tool to discover new susceptibility loci for non-familial disease.17 The association signals from a GWAS may represent common variants acting as ‘genetic risk factors’, but may also form a proxy for rare, moderately penetrant genetic variants, such as the repeat expansion in C9orf72.7,9 The discovery of the C9orf72 repeat expansion has, additionally, shown that intronic, non-coding variation may be causal to disease. Previously, the most recent and largest GWAS of sporadic ALS identified the locus on chromosome 9p21.2 (comprising C9orf72) and UNC13A as susceptibility loci.10,11 Recently, the first GWAS of FTD-TDP patients has been published, identifying three common variants in TMEM106B associated with susceptibility to sporadic FTD.16 The association with TMEM106B variants has now been replicated in independent cohorts including FTD-TDP patients.18,19
Both ALS and FTD may form parts of a spectrum of neurodegenerative disease. This spectrum ranges from pure motor ALS, to ALS with mild cognitive impairment, to FTD-MND, and ultimately, to pure FTD without motor neuron symptoms.20 In the present study, we sought to identify a common genetic basis for this spectrum of neurodegenerative disease. Therefore, we conducted a meta-analysis of all available GWAS data in ALS and TDP-43 positive FTD aimed at the discovery of additional common variants that would affect susceptibility to both neurodegenerative diseases.

METHODS
SUBJECTS
ALS cohorts were derived from all available previously published GWAS of ALS patients.10,11 We included 16 cohorts of Caucasian sporadic ALS patients (n = 4,638) and/or unaffected controls (n = 14,038) from six European countries and the USA for whom genome-wide genotype data were available. Previous replication cohorts with selected SNP sets (for example obtained by TaqMan genotyping) could not be included. For all cohorts, the diagnosis of probable or definite ALS was made according to the revised El Escorial criteria.21
We obtained raw genotype data for 658 individuals that were originally genotyped for the FTD-TDP GWAS, and were recruited from 11 countries in Europe, USA, Canada and Australia.16 In the original publication, 598 cases with FTD-TDP pathology matched the inclusion criteria, of which 515 were used for analysis. For the present study, we only included cases with FTD-TDP confirmed by TDP-43 immunohistochemistry, a single proband per pedigree, and only individuals of European descent. We excluded cases with mutations in GRN or VCP, resulting in 453 FTD-TDP cases that remained for further quality control. Clinical data on the presence of motor neuron symptoms were recorded. The Wellcome Trust Case Control Consortium (WTCCC) 1958 Birth Cohort was used for population controls.16
For replication purposes, we collected genomic DNA from a total of 2,218 sporadic ALS patients and 2,261 unaffected controls from The Netherlands, Germany, Sweden and Switzerland. In addition, in silico genotypes were obtained from Illumina beadchip data for 1,838 sporadic ALS patients and 1,697 controls from Italy. Dutch patients were recruited by neuromuscular centers at the University Medical Center Utrecht, the Radboud University Nijmegen Medical Center, and the Academic Medical Center Amsterdam, as part of an ongoing population-based study of ALS in The Netherlands. Unrelated control samples without any neuromuscular disease were matched for age and gender. Swedish samples were included from the Umeå University ALS Clinic that had not yet participated in previous GWAS included in the present meta-analysis. For the Swiss stratum, patients were recruited at Kantonsspital St. Gallen. German ALS patients were recruited through Ulm University Hospital and Charité University Hospital, Berlin. Control samples were unrelated individuals, free of any neuromuscular disease. Italian ALS patients were included through different Italian ALS centers as part of the Italian SLAGEN Consortium. Controls consisted of Italian healthy individuals who did not have a personal or family history of neurodegenerative diseases. For all strata, patients with ALS fulfilled the revised El Escorial Criteria for possible, probable, or definite ALS. Cases with a family history of ALS or non-Caucasian descent were excluded. As the discovery ALS and FTD GWAS samples include individuals with C9orf72 repeat expansions (no complete data available), we did not exclude C9orf72 repeat expansion carriers from the replication sets.
All participants gave written informed consent and approval was obtained from the local institutional review boards. More detailed information on ALS or FTD subject selection methods has been published previously and can be found in Supplementary Table 1.10,11,16

GENOTYPES AND QUALITY CONTROL
For each cohort, raw Illumina beadchip genotype data were obtained. The following quality control measures were applied to each cohort separately. We removed A/T and C/G SNPs in order to prevent allele swaps, as well as SNPs with a minor allele frequency (MAF) < 5%, a genotyping call rate < 95%, or deviation from Hardy-Weinberg equilibrium (HWE) in controls (p < 0.001). Samples with missing phenotype data, a genotyping call rate < 95%, high (inbreeding coefficient F > 0.05) or low (F < -0.025) heterozygosity rates, or where the clinically reported gender did not match the genotypic gender (based on chromosome X markers), were removed. SNP identifiers and positions were mapped to dbSNP 126 and NCBI genome build 36. For the WTCCC 2 cohort, markers and samples listed for exclusion (as provided by the WTCCC), and samples previously included in the WTCCC 1 cohort, were removed. Cohorts consisting of control samples only were merged with cohorts that included affected patients from the same country, ultimately forming twelve balanced strata for ALS and one for FTD. Per stratum, we excluded SNPs with differing missing rates between cases and controls (p < 1×10⁻³), or SNPs with differing missing rates between flanking haplotypes (p < 1×10⁻¹⁰).
For population stratification analysis purposes, strata were merged into a separate dataset containing only SNPs common to all strata. Population substructure was determined by using principal components analysis in EIGENSTRAT, also incorporating HapMap 3 release 2 samples.22 After the removal of population outliers (based on deviation from the Utah residents with ancestry from northern and western Europe (CEU) + Toscani in Italia (TSI) population cluster in a plot of the first two principal components), duplicate (PI_HAT value > 0.9), and related samples (PI_HAT > 0.2), new principal components were calculated. For the ALS and FTD meta-analysis, we removed duplicate and related individuals across disease datasets. See Supplementary Table 2 for Hardy-Weinberg equilibrium (HWE) p values and call rates for significantly associated SNPs per stratum.
For the replication of association with the chromosome 8q24.13 locus, we used KASPar (KBioscience) and TaqMan (Applied Biosystems) assays to determine rs13268726 and rs12546767 genotypes in the replication set, according to the manufacturer’s protocols. We used an ABI Prism 7900HT analyzer (Applied Biosystems) and SDS v2.3 software (Applied Biosystems) to determine genotype clusters, and outliers were excluded from further analyses. For the Italian SLAGEN cohort, in silico genotypes were obtained from Illumina Human-660W Quad Beadchips for rs12546767, and rs13268726 genotypes were imputed using IMPUTE v2 and HapMap3 release 2 and 1000 Genomes pilot reference panels.
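Part of the per-SNP quality control described in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with `None` for a missing call, the function names are hypothetical, and the Hardy-Weinberg and differential-missingness tests are left out.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, whose alleles look identical on either strand."""
    return COMPLEMENT[allele1.upper()] == allele2.upper()


def snp_passes_qc(genotypes, min_maf=0.05, min_call_rate=0.95):
    """Apply the MAF and call-rate thresholds to one SNP.

    genotypes: list with 0, 1 or 2 copies of one allele per sample,
    or None where the genotype call is missing (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    allele_freq = sum(called) / (2 * len(called))
    maf = min(allele_freq, 1.0 - allele_freq)
    return call_rate >= min_call_rate and maf >= min_maf


# Hypothetical SNPs: a common well-called SNP, a rare SNP, a poorly called SNP.
common_snp = [0, 1, 2, 1] * 5
rare_snp = [0] * 19 + [1]
poorly_called_snp = [1] * 18 + [None] * 2
```

With these inputs only `common_snp` is retained: `rare_snp` has a minor allele frequency of 2.5% and `poorly_called_snp` a call rate of 90%.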

STATISTICAL ANALYSIS
Because of the use of many different genotyping platforms, and a relatively small number of markers with genotypes in all strata, we used genome-wide SNP imputation to extend genome-wide coverage and to increase comparability.23 Imputation was carried out using IMPUTE2 v2.1.2 in genomic chunks of 5 Mb, leaving all options at the program’s defaults. We preferred the HapMap3r2 CEU+TSI reference (~1.4M markers) as a reference panel, because of the relatively large number of reference haplotypes (n = 410). We additionally imputed using the HapMap2 reference (120 haplotypes, ~2.5M markers), to check that we did not miss important associations compared to the HapMap3r2 reference. Imputation using the HapMap2 reference did not yield additional significant results (data not shown). Imputed genotypes were stored as allele dosage data, which are continuous numerical values indicating the estimated number of minor alleles (ranging from 0 to 2), thus incorporating a measure of imputation uncertainty.
Associations between genotypes and disease susceptibility were tested in logistic regression models for each of the strata in PLINK. Gender and principal components (PCs) that were strongly (p < 1×10⁻⁵) associated with disease status were included as covariates in the logistic regression analyses (seven PCs for ALS, two PCs for FTD). Results from each of the strata were joined in a fixed effect inverse variance meta-analysis in PLINK, both per disease (ALS or FTD) separately, and combined.24,25 See Supplementary Table 2 for allele and genotype frequencies per stratum for the topmost associated SNPs. We calculated genotypic odds ratios (ORgeno) for top-associated markers in heterozygotes (1 minor allele vs. 2 major allele carriers) and homozygotes (2 minor allele vs. 2 major allele carriers) using logistic regression, and incorporating the same covariates as were used in the main disease susceptibility analyses.
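For readers unfamiliar with the fixed effect inverse variance method used to join the strata, the calculation for a single SNP can be sketched as follows. The sketch shows the standard formula, not PLINK's implementation; the function name and the per-stratum effect sizes are hypothetical, and the p value uses a normal approximation.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse variance weighted fixed effect meta-analysis for one SNP.

    betas: per-stratum log odds ratios for the same effect allele.
    standard_errors: their standard errors.
    Returns the pooled log odds ratio, its standard error, z and two-sided p.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return pooled_beta, pooled_se, z, p


# Two hypothetical strata with the same effect estimate and precision.
beta, se, z, p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
```

Because each stratum is weighted by the inverse of its variance, large strata dominate the pooled estimate, which is the reason a sample-size independent method is used alongside it later in this section.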

We used the GCTA software package to calculate trait heritability estimates for ALS and FTD-TDP cohorts, as well as to estimate the cross-traits heritability, using a bivariate restricted maximum likelihood analysis, as has been described previously.26,27 Imputed dosage data were used as input, excluding SNPs with a poor imputation quality score (R² < 0.3) and minor allele frequency < 0.01. The first seven principal components calculated from a combined dataset of ALS and FTD were included as covariates. Additionally, we performed a conditional and joint multiple-SNP analysis of the ALS and FTD meta-analysis results in order to determine possible independent association signals within a genomic locus reaching genome-wide significance.28 The conditional and joint analyses were carried out after converting imputed dosage data to hard-called genotypes (as required by the GCTA software).
As an additional approach, in order to determine SNPs with shared susceptibility to both ALS and FTD, we selected SNPs with p < 1×10⁻⁴ from the above ALS meta-analysis and used the FTD-TDP data to replicate the associations with these SNPs. We used linkage disequilibrium (LD)-based clumping of SNPs in PLINK in order to cluster multiple genotyped and imputed SNPs within a region of strong LD, thus determining independent loci.25 Per clump, we looked up p values of SNPs from the above logistic regression results in the FTD analysis. We applied a Bonferroni multiple-testing correction for the number of independent loci (clumps) tested. Subsequently, SNPs with p values below the threshold of 1×10⁻³ in the FTD analysis were selected for replication in the ALS data. For the selection of SNPs from the FTD analysis, we used a less stringent p value threshold (p < 1×10⁻³) in order to avoid the omission of false-negative associations due to limited statistical power of the FTD-TDP data.
Furthermore, as sample sizes differ substantially between the joined ALS strata and the FTD-TDP stratum, we would be almost exclusively measuring the GWAS signal from the ALS patients. We, therefore, used a conservative rank products (RP) analysis to compare results from both ALS and FTD.29 The rank products method originates from the analysis of multiple expression micro-array experiments, and provides a non-parametric test that is independent of sample size. For the RP analysis, we ranked SNPs by increasing p value for each disease (ALS or FTD). Only SNPs with the same direction of effect were included. For each SNP we calculated the RP. Ultimately, SNPs were sorted by increasing rank product, and a permutation test was used to determine statistical significance for each RP. In order to obtain empirical p values, we permuted the ranks of each SNP 100,000,000 times and counted the number of times the permuted RP was equal to or higher than the observed RP. SNPs with an empirical p value < 5×10⁻⁸ were considered to be significantly top-ranked for both diseases.
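The rank products idea can be made concrete with a small Python sketch. It is a simplified illustration, not the analysis code of the study, and it rests on several assumptions that should be read as such: the p values are hypothetical, SNPs are assumed to have been filtered for concordant direction of effect beforehand, the null distribution is simulated by drawing one independent uniform rank per disease, and the empirical p value counts permuted rank products that are at least as small as the observed one (the usual direction for a test in which small rank products indicate top ranking in both analyses).

```python
import random


def rank_by_p(p_values):
    """Rank SNPs by increasing p value; rank 1 is the smallest p value."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    ranks = [0] * len(p_values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_products(p_disease_a, p_disease_b):
    """Product of the two per-disease ranks for every SNP."""
    return [ra * rb for ra, rb in zip(rank_by_p(p_disease_a), rank_by_p(p_disease_b))]


def empirical_p(observed_rp, n_snps, n_permutations=10000, seed=1):
    """Fraction of random rank pairs whose product is <= the observed product."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_permutations):
        permuted_rp = rng.randint(1, n_snps) * rng.randint(1, n_snps)
        if permuted_rp <= observed_rp:
            hits += 1
    return hits / n_permutations


# Hypothetical p values for four SNPs in the two analyses.
p_als = [1e-9, 0.2, 0.5, 0.01]
p_ftd = [1e-4, 0.6, 0.3, 0.02]
rps = rank_products(p_als, p_ftd)  # ranks (1,1), (3,4), (4,3), (2,2)
```

Because only ranks enter the statistic, a SNP must be near the top of both lists to obtain a small rank product, regardless of how much larger one study is than the other. Reaching an empirical threshold of 5×10⁻⁸ requires on the order of the 100,000,000 permutations mentioned above, far more than the default used in this sketch.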


Figure 1

Manhattan plots for association results obtained from (A) meta-analysis of amyotrophic lateral sclerosis (ALS) strata, (B) analysis of the frontotemporal dementia (FTD) stratum, (C) joint meta-analysis of both ALS and FTD strata, and (D) joint meta-analysis of both ALS and FTD strata after the removal of 99 FTD cases with motor neuron disease (MND) symptoms. Each dot represents a single nucleotide polymorphism; -log10 p values are shown on the y-axis, and chromosomal positions on the x-axis. Chromosomes are numbered along the x-axis and are designated by changing colors. The threshold for genome-wide significance (p < 5 × 10⁻⁸) is indicated by a dotted line. Genome-wide significant loci, the 8q24.13 locus, and for the FTD analysis the previously associated TMEM106B locus are highlighted.

REPLICATION OF ASSOCIATION WITH LOCUS 8q24.13

In the replication set of sporadic ALS patients and unaffected controls, for each stratum, association between SNP and disease status was tested in a logistic regression model corrected for gender. Subsequently, results were analyzed in a weighted inverse variance meta-analysis in PLINK.
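For readers unfamiliar with inverse-variance weighting, the following Python sketch shows a fixed-effect combination of two log odds ratios. It is not PLINK's implementation. The helper names are hypothetical, and the standard errors are back-calculated from the rounded odds ratios and p values reported for rs13268726 in Table 3 (discovery meta-analysis and ALS replication), so the pooled result should only land near the combined estimate in that table.

```python
import math
from statistics import NormalDist


def se_from_or_and_p(odds_ratio, p_value):
    """Back-calculate the standard error of log(OR) from a two-sided p value."""
    z = abs(NormalDist().inv_cdf(p_value / 2.0))
    return abs(math.log(odds_ratio)) / z


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis of (log_or, se) pairs.

    Returns the pooled odds ratio, its standard error on the log scale,
    and the two-sided p value.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return math.exp(pooled_beta), pooled_se, p_value


# Rounded summary statistics for rs13268726 as reported in Table 3
# (ALS + FTD meta-analysis, then ALS replication); SEs are back-calculated.
reported = [(0.79, 3.91e-7), (0.89, 0.044)]
inputs = [(math.log(or_), se_from_or_and_p(or_, p)) for or_, p in reported]
pooled_or, pooled_se, pooled_p = inverse_variance_meta(inputs)
print(f"pooled OR = {pooled_or:.2f}, p = {pooled_p:.2e}")
```

Each stratum is weighted by the inverse of its variance, so larger and more precise strata dominate the pooled estimate.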

RESULTS

After quality control, there were twelve ALS strata (4,377 cases, 13,017 controls), and one FTD stratum (435 cases, 1,414 controls). For some cohorts, the total number of cases and controls differed from the original publications, due to different quality control methods. Genomic inflation factors (λGC) ranged from 1.01 to 1.07, indicating adequate quality control and correction for population stratification (Supplementary Table 1). Associations did not reach genome-wide significance (p < 5×10⁻⁸) in any of the 13 strata separately.
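The genomic inflation factor compares the median observed test statistic with its expectation under the null. A minimal Python sketch follows, assuming two-sided p values from 1-degree-of-freedom tests; the function name and the evenly spaced toy p values are hypothetical and do not come from the study data.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Estimate lambda_GC from two-sided p values of 1-df association tests."""
    norm = NormalDist()
    # Convert each p value to its 1-df chi-square statistic (z squared).
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of the 1-df chi-square distribution under the null (about 0.455).
    expected_median = norm.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Hypothetical, evenly spaced p values that mimic a null distribution.
n = 9999
null_p = [(i + 0.5) / n for i in range(n)]
lambda_gc = genomic_inflation(null_p)
print(f"lambda_GC = {lambda_gc:.3f}")
```

Values close to 1, such as the 1.01 to 1.07 reported here, indicate little residual inflation from population stratification or technical artifacts.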

Figure 2

Forest plots of top-associated single nucleotide polymorphisms (SNPs) in the amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) meta-analysis. Forest plots for top SNPs near (A) C9orf72 (rs3849943), (B) UNC13A (rs12608932), and (C) locus chromosome 8q24.13 (rs12546767) are shown. Strata are coded as NL1, NL2, et cetera (see Supplementary Table 1 for details on strata naming). Association test results between genotype and disease are shown for each stratum, for each disease separately, and for the combined analysis of ALS and FTD. In addition, for rs12546767, association test results for ALS replication strata are shown. Meta-analysis results are shown for a fixed effect model. Box sizes are relative to stratum sample sizes. Horizontal lines indicate 95% confidence intervals.


ANALYSIS OF ALS AND FTD SEPARATELY

First, we inspected a meta-analysis of the ALS strata. Consistent with previous findings, we found genome-wide significant hits near C9orf72 on chromosome 9p21.2 (top SNP rs3849943) and in gene UNC13A (top SNP rs12608932) on chromosome 19p13.11 (Fig 1 and 2; Table 1).10,11

Table 1 Top association results for independent loci containing SNPs, per disease and in the joint meta-analysis

                                                        ALS analysis       FTD analysis      Meta-analysis
Chr  Locus     n SNPs  Top SNP      Minor allele  MAF   OR    p            OR    p           OR    p
9    9p21.2    29      rs3849943    C             0.24  1.22  5.48×10⁻¹⁰   1.38  5.53×10⁻⁴   1.24  2.60×10⁻¹²
19   UNC13A    1       rs12608932   C             0.35  1.18  1.70×10⁻⁸    1.46  6.57×10⁻⁶   1.21  1.02×10⁻¹¹
8    8q24.13   11      rs13268726   G             0.10  0.80  4.69×10⁻⁶    0.70  0.020       0.79  3.91×10⁻⁷
17   CENPV     1       rs7477       A             0.49  1.16  2.91×10⁻⁷    0.98  0.785       1.14  1.82×10⁻⁶
3    TXNDC6    8       rs7638688    A             0.29  0.85  1.20×10⁻⁶    0.97  0.757       0.87  2.66×10⁻⁶
16   KIAA0513  3       rs16975170   T             0.12  1.17  2.27×10⁻⁴    1.53  5.99×10⁻⁴   1.20  4.20×10⁻⁶
5    FAM13B    1       rs9327807    C             0.18  0.85  1.24×10⁻⁵    0.87  0.196       0.85  5.25×10⁻⁶
7    7p21.1    2       rs10233425   C             0.01  1.88  7.46×10⁻⁶    1.44  0.391       1.83  6.15×10⁻⁶
10   BUB3      1       rs11248416   G             0.11  1.23  3.64×10⁻⁵    1.27  0.103       1.24  9.19×10⁻⁶
17   17p13.2   1       rs12950017   C             0.24  1.16  1.47×10⁻⁵    1.11  0.293       1.15  9.19×10⁻⁶

Per locus, the number of SNPs with p < 1×10⁻⁵ is indicated, and association results for the SNP with the most significant p value (top SNP) are presented. The ALS analysis was based on 4,377 sporadic ALS patients and 13,017 controls. The FTD analysis was based on 435 sporadic FTD-TDP cases and 1,414 population controls. The meta-analysis comprises a total of 4,811 patients with either ALS or FTD, and 14,428 controls. Chr = chromosome; MAF = weighted minor allele frequency across all datasets; OR = odds ratio.

Subsequently, we analyzed the separate FTD-TDP stratum. We found no genome-wide significant associations, which was consistent with the results for patients without progranulin (GRN) mutations in the original publication (Fig 1).16 The exact association results differed minimally from the original publication, most probably due to the use of a partly different control cohort, and different methods for quality control and statistical analysis.

COMBINED ALS AND FTD ANALYSIS

To examine common genetic variants that are shared in ALS and FTD-TDP, we applied three complementary methods to avoid only picking up ALS effects, considering the imbalance in cohort size. First, all strata were joined into a single meta-analysis. The signal was greatly enhanced not only at 9p21.2 (C9orf72), but also at UNC13A (Fig 1). For rs3849943 at 9p21.2, the genotypic odds ratio (ORgeno) in heterozygotes is 1.25 (p = 3.19×10⁻⁷), in homozygotes the ORgeno is 1.19 (p = 6.76×10⁻⁵); while for rs12608932 in UNC13A, the heterozygote ORgeno and homozygote ORgeno are 1.12 (p = 0.014) and 1.29 (p = 3.33×10⁻¹⁵), respectively. Results for markers previously associated with sporadic ALS or FTD-TDP can be found in Supplementary Table 3. We did not identify any new genome-wide significant associations in the joint meta-analysis, although one new locus nearing the genome-wide significance threshold emerged at chromosome 8q24.13 (Fig 1).

Using a restricted maximum likelihood analysis, we estimated trait heritability for both ALS and FTD cohorts separately, as well as the cross-traits heritability between ALS and FTD-TDP. We estimated a SNP heritability for ALS of 0.21 (standard error 0.02), while for FTD-TDP no reliable heritability estimate could be calculated due to lack of statistical power. The genetic correlation between ALS and FTD-TDP was modest, but significant (0.20, standard error 0.098, p = 0.012). The SNP-based coheritability was estimated at 0.02 (standard error 0.007).

To further demonstrate shared susceptibility to both ALS and FTD-TDP for the C9orf72 and UNC13A loci, we selected top-associated SNPs from one disease and tried to replicate their association in the other disease. We selected 191 SNPs with p < 1×10⁻⁴ in 75 independent loci from the ALS meta-analysis, and looked up association results for these SNPs in the FTD analysis, applying a Bonferroni multiple-testing correction for the number of independent loci tested. We thus identified six SNPs in two loci (UNC13A and C9orf72) that

Table 2 Replication of top SNPs from ALS analysis in FTD data and vice versa

Top SNPs from ALS replicated in FTD
                                       ALS analysis       FTD analysis
Chr  Locus   SNP         Minor allele  OR    p            OR    p           p Bonferroni
19   UNC13A  rs12608932  C             1.18  1.70×10⁻⁸    1.46  6.57×10⁻⁶   4.93×10⁻⁴
9    9p21.2  rs2453554   T             1.21  3.06×10⁻⁹    1.39  3.35×10⁻⁴   0.025
9    9p21.2  rs700791    A             1.22  1.51×10⁻⁹    1.39  4.21×10⁻⁴   0.032
9    9p21.2  rs3849942   T             1.22  9.12×10⁻¹⁰   1.38  5.41×10⁻⁴   0.041
9    9p21.2  rs3849943   C             1.22  5.48×10⁻¹⁰   1.38  5.53×10⁻⁴   0.041
9    9p21.2  rs774359    C             1.18  1.09×10⁻⁷    1.37  5.67×10⁻⁴   0.042

Top SNPs from FTD replicated in ALS
                                       FTD analysis       ALS analysis
Chr  Locus   SNP         Minor allele  OR    p            OR    p           p Bonferroni
9    9p21.2  rs3849943   C             1.38  5.53×10⁻⁴    1.22  5.48×10⁻¹⁰  3.16×10⁻⁷
9    9p21.2  rs10967965  T             1.39  8.41×10⁻⁴    1.24  5.80×10⁻¹⁰  3.34×10⁻⁷
9    9p21.2  rs3849942   T             1.38  5.41×10⁻⁴    1.22  9.12×10⁻¹⁰  5.25×10⁻⁷
9    9p21.2  rs700791    A             1.39  4.21×10⁻⁴    1.22  1.51×10⁻⁹   8.71×10⁻⁷
9    9p21.2  rs17779457  G             1.36  8.35×10⁻⁴    1.21  2.93×10⁻⁹   1.69×10⁻⁶
9    9p21.2  rs2453554   T             1.39  3.35×10⁻⁴    1.21  3.06×10⁻⁹   1.76×10⁻⁶
19   UNC13A  rs12608932  C             1.46  6.57×10⁻⁶    1.18  1.70×10⁻⁸   9.77×10⁻⁶
9    9p21.2  rs774359    C             1.37  5.67×10⁻⁴    1.18  1.09×10⁻⁷   6.30×10⁻⁵

Top SNPs from ALS replicated in FTD without MND signs
                                       ALS analysis       FTD without MND signs analysis
Chr  Locus   SNP         Minor allele  OR    p            OR    p           p Bonferroni
19   UNC13A  rs12608932  C             1.18  1.70×10⁻⁸    1.39  4.52×10⁻⁴   0.034

Top SNPs from FTD without MND signs replicated in ALS
                                       FTD without MND signs analysis   ALS analysis
Chr  Locus   SNP         Minor allele  OR    p            OR    p           p Bonferroni
19   UNC13A  rs12608932  C             1.39  4.52×10⁻⁴    1.18  1.70×10⁻⁸   8.63×10⁻⁶

The upper part of the table shows SNPs with p < 1×10⁻⁴ in the ALS analysis and with p < 0.05 after Bonferroni correction for the number of independent loci (n = 191) tested in the FTD analysis. Conversely, the second part of the table shows SNPs with p < 1×10⁻³ in the FTD analysis and with p < 0.05 after Bonferroni correction for the number of independent loci (n = 658) tested in the ALS analysis. Subsequently, SNPs are shown with p < 1×10⁻⁴ in the ALS analysis, and with p < 0.05 after Bonferroni correction for the number of independent loci (n = 191) tested in the FTD-TDP without MND signs analysis. The lower part of the table shows SNPs with p < 1×10⁻³ in the FTD-TDP without MND signs analysis and with p < 0.05 after Bonferroni correction for the number of independent loci (n = 1,374) tested in the ALS analysis. Results are sorted by increasing Bonferroni-corrected p value. Chr = chromosome; OR = odds ratio; MND = motor neuron disease.

were significantly replicated in the FTD-TDP data (Table 2). Conversely, we selected 1,450 SNPs with p < 1×10⁻³ in 658 independent loci from the FTD analysis and replicated these SNPs in the ALS data. We, again, identified two loci (seven SNPs near C9orf72 and one in UNC13A) that were significantly replicated in the ALS data (Table 2).

Subsequently, we used a third approach to investigate whether the associations near C9orf72 and UNC13A were exclusively driven by one of the disease cohorts. Because of the large sample size of the ALS strata, the combined meta-analysis signals might be largely driven by ALS patients. To take this into account, we used a conservative, sample size-independent, rank products (RP) analysis. This analysis identifies SNP associations whose p values are ranked highest in both diseases, thus allocating equal weight to both ALS and FTD results. Supplementary Table 4 shows the top ten SNPs from this analysis, arranged according to increasing RP. The -log10 p values of SNPs with the lowest RPs are ranked highest in both diseases. Empirical p values for the rank products, as determined by 100,000,000 permutations, were highly significant for the top six SNPs (p < 5×10⁻⁸). SNP rs12608932 in UNC13A had the strongest signal in both diseases. The following nine most significant SNPs were all in the C9orf72 locus. Fig 3 visualizes the results from both studies; associations present in both ALS and FTD strata clearly stand out from the disease-specific and non-significant associations.

Figure 3

Association results in amyotrophic lateral sclerosis (ALS) versus frontotemporal dementia (FTD). Plot shows -log10 p values of single nucleotide polymorphism (SNP) associations in ALS (x-axis) versus -log10 p values in FTD (y-axis). SNP rs12608932 is marked by UNC13A, and a gray-colored area designates a cluster of SNPs located within the chromosome 9p21.2 locus. Only SNPs with the same direction of effect have been included in the figure. Based on -log10 p values, rs12608932 was ranked highest in both diseases. chr = chromosome.

Our study most likely includes a substantial number of C9orf72 repeat expansion carriers; however, we are not able to retrieve repeat expansion status for all patients included. As a proxy for C9orf72 repeat expansion status, we adjusted association tests for rs3849943 genotype (the top SNP at the 9p21.2 locus). The association at UNC13A was not dependent on rs3849943 SNP genotype in ALS (OR 1.18, p = 8.64×10⁻⁹), FTD (OR 1.45, p = 9.09×10⁻⁶), and the ALS-FTD meta-analysis (OR 1.21, p = 5.68×10⁻¹²). Instead, the statistical significance of the association with UNC13A increased. Similarly, for the 8q24.13 locus (top SNP rs13268726), association results that were corrected for rs3849943 genotype in ALS (OR 0.80, p = 6.04×10⁻⁶), FTD (OR 0.71, p = 0.029), and the ALS-FTD meta-analysis (OR 0.79, p = 6.47×10⁻⁷) were very similar to the unadjusted results (Table 1).

Furthermore, we performed a systematic, genome-wide conditional analysis to identify possible additional, independent association signals in our meta-analysis of ALS and FTD-TDP. The conditional analysis did not yield any additional independent association signals at the chromosome 9p21.2 locus, gene UNC13A and chromosome 8q24.13. For each locus, association signals were driven by the SNP with the lowest p value. Adjusted joint p values did not differ notably from the original p values, indicating that these three SNPs represent true independent signals (data not shown).

ANALYSES INCLUDING ONLY FTD-TDP CASES WITHOUT MOTOR NEURON SYMPTOMS

As noted previously, a substantial proportion of FTD-TDP patients also have motor neuron symptoms. To verify that association signals from the FTD-TDP cohort were not solely driven by patients with motor neuron symptoms, we repeated the ALS and FTD meta-analysis after removing all FTD-TDP patients with signs of motor neuron disease (n = 99) from the FTD stratum. As expected, statistical power was reduced for the set of FTD-only patients. However, signals at C9orf72 and UNC13A were still enhanced by adding the FTD-only cases into the meta-analysis (Fig 1). Also, the rank products analysis showed consistent ranking of the top 6 markers (Supplementary Table 4).

NEW SIGNAL ON CHROMOSOME 8q24.13

In the combined ALS and FTD meta-analysis, there was one new signal on chromosome 8q24.13 with top SNPs rs13268726 (imputed, OR 0.79, p = 3.91×10⁻⁷), and rs12546767 (genotyped, OR 0.80, p = 4.68×10⁻⁷) and mapping to a region with strong LD comprising genes SQLE, KIAA0196, and NSMCE2 (Fig 4). Both top SNPs are in strong LD (r² = 0.95, D′ = 0.975). Additionally, in the analysis of FTD-TDP cases without motor neuron symptoms, we found consistent associations for rs13268726 (OR 0.69, p = 0.029) and rs12546767 (OR 0.67, p = 0.022) compared to the full FTD stratum (Table 3). We followed up rs13268726 (in SQLE) and rs12546767 (in KIAA0196) in a replication set of 4,056 sporadic ALS patients and 3,958 controls. We replicated the associations between the two SNPs in this locus and ALS susceptibility, with the lowest p value for rs12546767 (OR 0.88, p = 0.026) in gene KIAA0196 (Fig 2). Joint analysis of both genome-wide and replication data reached a p value of 1.01×10⁻⁷ for this locus, but did not reach genome-wide significance (p < 5×10⁻⁸). Detailed results are shown in Table 3.


Figure 4

Regional association plot for association signals at chromosome 8q24.13 in the combined amyotrophic lateral sclerosis and frontotemporal dementia meta-analysis. The y-axis presents -log10 p values for each single nucleotide polymorphism (SNP); the x-axis shows the chromosomal position in megabases. Dot colors indicate the amount of linkage disequilibrium (r²) with the index SNP rs12546767 (which is the strongest associated genotyped SNP in the locus). Both imputed and genotyped SNPs are shown. The lower panel shows gene positions; arrows indicate the transcriptional direction.

DISCUSSION

In the present study, we conducted a large genome-wide meta-analysis in a combined cohort of nearly 5,000 patients and over 14,000 controls. We aimed at finding new genetic variants that would affect susceptibility to both sporadic ALS and TDP-43 positive FTD. With three complementary methods, to avoid only picking up effects of the larger ALS cohort, we identified not only the known 9p21 locus including C9orf72, but also the UNC13A locus as shared between both neurodegenerative diseases. By combining results from ALS and FTD datasets in a joint meta-analysis, we found a strong increase of association signals at both loci. Furthermore, by replication of top-associated SNPs from one disease in the other and

Table 3 Association results for top SNPs at the chromosome 8q24.13 locus

                                          ALS analysis      FTD analysis   ALS + FTD meta-analysis   Replication ALS   Meta-analysis + replication combined
SNP         Gene      Minor allele  MAF   OR    p           OR    p        OR    p                   OR    p           OR    p
rs13268726  SQLE      G             0.10  0.80  4.69×10⁻⁶   0.79  0.020    0.79  3.91×10⁻⁷           0.89  0.044       0.82  1.85×10⁻⁷
rs12546767  KIAA0196  C             0.10  0.80  6.63×10⁻⁶   0.69  0.015    0.79  4.78×10⁻⁷           0.88  0.026       0.82  1.01×10⁻⁷

Association results are shown for a weighted inverse-variance meta-analysis of 12 sporadic ALS cohorts (4377 ALS cases, 13017 controls), for logistic regression analysis in the FTD-TDP cohort (435 FTD-TDP cases, 1414 controls), for the combined ALS-FTD meta-analysis (4811 cases, 14428 controls), and for an independent replication cohort of sporadic ALS (4056 ALS, 3958 controls). In the righthand column, results for a combined analysis of both the ALS-FTD meta-analysis data and replication cohort are shown (8867 cases, 18386 controls). MAF = weighted minor allele frequency across all datasets; OR = odds ratio.

vice versa, we found both the C9orf72 and UNC13A loci to be significantly associated in the ALS as well as in the FTD analysis. We also showed that signals at both loci are not solely driven by the relatively large number of ALS patients. Furthermore, by repeating the meta-analysis selecting only FTD patients without motor neuron disease, we demonstrated that the signals from the meta-analysis are not unique to individuals with motor neuron symptoms in both groups. Lastly, we found a modest, but significant genetic correlation between ALS and FTD-TDP.

The observation that the UNC13A association signal is shared between ALS and FTD-TDP cohorts highlights UNC13A not only as a susceptibility gene in ALS, but also as a susceptibility factor for FTD-TDP, further corroborating the role of UNC13A in neuronal degeneration. Previously we have shown that rs12608932 acts as a modifier of survival in ALS, which was recently replicated in an Italian cohort of ALS patients.30,31 UNC13A, therefore, poses an interesting therapeutic target. The protein encoded by UNC13A is a member of a family of presynaptic proteins present throughout the nervous system and involved in the priming of presynaptic vesicles containing neurotransmitters before their release.32 Aberrant function of UNC13A disrupts the release of excitatory and inhibitory neurotransmitters. This not only affects the biochemical communication between neurons, but also triggers structural changes in existing neuronal networks.32 Furthermore, changes in the release of neurotransmitters as a consequence of defects in UNC13A are in line with the glutamate excitotoxicity hypothesis, previously implicated in motor neuron degeneration. Thus, our findings implicate changes in neurotransmitter release and synaptic function as a converging mechanism in the pathogenesis of ALS and FTD.33

The highly significant signal generated by 19 SNPs at the chromosome 9p21.2 locus is most likely linked to C9orf72 repeat expansion status.
As we have C9orf72 repeat expansion status available for only a few of the strata included in the present meta-analysis, we were not able to perform an analysis with C9orf72 wild-type carriers only to search for a residual effect of this locus in ALS-FTD. As a proxy for C9orf72 repeat expansion status, we conditioned on rs3849943 genotype and showed that the association signal at UNC13A increased. Moreover, it is interesting to note that C9orf72 is structurally related to DENN RabGEFs and that Rabs cooperate with UNC13A to regulate synaptic functions.34-36 Therefore, these observations hint at the exciting possibility that defects in UNC13A and C9orf72 in ALS and FTD converge upon the same synaptic mechanisms.

The present study identified one locus with increased disease association signal in the combined ALS and FTD meta-analysis on chromosome 8q24.13, nearing the genome-wide significance threshold. We successfully replicated disease association for the two top SNPs in this locus (rs13268726 and rs12546767) in an independent cohort of sporadic ALS patients. The 8q24.13 locus maps to a region of high LD encompassing SQLE, KIAA0196 and NSMCE2 (Fig 4). Of these genes KIAA0196 appears to be of particular interest. First, mutations in KIAA0196 cause hereditary spastic paraplegia, which shares clinical upper motor neuron dysfunction with ALS.37 Second, KIAA0196 (alias SPG8) encodes for strumpellin, a valosin-containing protein (VCP) binding partner in the human central nervous system.38 Mutations in VCP have previously been identified in both ALS and FTD patients.13 Third, protein aggregates containing strumpellin have been found in patients with inclusion body myopathy associated with Paget disease of bone and frontotemporal dementia (IBMPFD).38 Nevertheless, since the combined analysis of the discovery and replication samples did not show a genome-wide significant association (p = 1.01×10⁻⁷), further work is needed to establish the role of KIAA0196 and other genes in the associated locus in ALS-FTD pathogenesis.

Our study is the largest GWAS including sporadic ALS and TDP-43 positive FTD patients in search of new loci for neurodegeneration. With over 4,300 ALS patients and over 14,000 controls the study was well powered to detect associations of common variants with modest effect size. For example, we estimated 97% power for the detection of an association similar to rs3849943 on chromosome 9p21 at α = 5×10⁻⁸.
In terms of sample sizes required for GWAS, the FTD-TDP cohort was relatively small, but unique due to the homogenous TDP-43 pathology. The relatively small increase in statistical power obtained by adding 435 TDP-43 positive FTD cases might provide an important explanation for why we did not identify new shared susceptibility loci with genome-wide significance. Also, to replicate our findings in TDP-43 positive FTD cohorts, one would require a minimum of 1,000 to 1,950 TDP-43 pathology-proven FTD cases and controls to achieve 80% power for detecting an effect with OR 0.8 and 1.2 (with minor allele frequency 0.1 to 0.35 at α = 0.05), which is clearly challenging and requires further international collaborations. In addition, future studies implementing a combined analysis with large cohorts of patients with neurodegenerative disease including FTD, late-onset Alzheimer’s disease or Parkinson’s disease might add more statistical power and improve chances of finding new shared susceptibility loci for neurodegeneration.

For the FTD-TDP cohort, clinical information was available allowing us to identify patients with and without motor neuron signs, although we cannot definitively rule out the possibility that a small proportion of FTD cases classified as “without motor neuron signs” still developed motor neuron disease after the last clinical follow-up. For the ALS strata, however, data on frontotemporal dementia or cognitive impairment in ALS patients were not available. Therefore, the extent to which signals from the meta-analysis are driven by signs of frontotemporal dementia or cognitive impairment is not known exactly, although previous studies have estimated a proportion of 5-10% of ALS patients also having FTD.1,2 Furthermore, previously, an association was reported of variants in TMEM106B with susceptibility to FTD and with cognitive impairment in ALS patients.18,19 In the present meta-analysis we did find this locus in FTD, but we did not find a significant association of variants in TMEM106B with ALS. It is possible that an association with TMEM106B exists within a subset of ALS patients with cognitive impairment. Careful deep phenotyping of samples in future GWAS studies will help to shed light on the genetic determinants of motor neuron dysfunction versus cognitive impairment.

In conclusion, our meta-analysis identifies UNC13A as a novel link between ALS and TDP-43 positive FTD, which identifies synaptic defects as a shared disease mechanism and further corroborates the role of UNC13A and synaptic mechanisms in neuronal degeneration. Our results provide a novel starting point for further dissection of shared pathogenic pathways underlying ALS and FTD.
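The replication sample sizes quoted in the discussion above (1,000 to 1,950 cases and controls for 80% power at α = 0.05) can be approximated with a standard two-proportion calculation on allele frequencies. The Python sketch below is an illustration under stated assumptions, not the power calculation used in the study: the function name `cases_needed` is hypothetical, the test is assumed to be allelic with equal numbers of cases and controls, and the minor allele frequency is taken as the control allele frequency. Under these assumptions the results should be of the same order as the quoted range.

```python
import math
from statistics import NormalDist


def cases_needed(maf, odds_ratio, alpha=0.05, power=0.80):
    """Cases (with an equal number of controls) for an allelic test.

    Uses the normal approximation for comparing two allele frequencies,
    taking `maf` as the control allele frequency.
    """
    norm = NormalDist()
    p0 = maf
    # Allele frequency in cases implied by the allelic odds ratio.
    p1 = odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)
    p_bar = (p0 + p1) / 2.0
    z_alpha = norm.inv_cdf(1.0 - alpha / 2.0)
    z_beta = norm.inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_beta * math.sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return math.ceil(alleles_per_group / 2.0)  # two alleles per person


# The two boundary scenarios mentioned in the text.
n_common = cases_needed(0.35, 1.2)
n_rare = cases_needed(0.10, 0.8)
print(f"MAF 0.35, OR 1.2: about {n_common} cases and as many controls")
print(f"MAF 0.10, OR 0.8: about {n_rare} cases and as many controls")
```

The required sample size grows quickly as the allele becomes rarer or the odds ratio approaches 1, which is why pathology-proven FTD-TDP cohorts of this size require international collaboration.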


REFERENCES

1. Murphy J, Henry R, Lomen-Hoerth C. Establishing subtypes of the continuum of frontal lobe impairment in amyotrophic lateral sclerosis. Arch Neurol 2007;64:330-334.
2. Ringholz GM, Appel SH, Bradshaw M, et al. Prevalence and patterns of cognitive impairment in sporadic ALS. Neurology 2005;65:586-590.
3. Brettschneider J, Del Tredici K, Toledo JB, et al. Stages of pTDP-43 pathology in amyotrophic lateral sclerosis. Ann Neurol 2013;74:20-38.
4. Gijselinck I, Engelborghs S, Maes G, et al. Identification of 2 Loci at chromosomes 9 and 14 in a multiplex family with frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Arch Neurol 2010;67:606-616.
5. Le Ber I, Camuzat A, Berger E, et al. Chromosome 9p-linked families with frontotemporal dementia associated with motor neuron disease. Neurology 2009;72:1669-1676.
6. Vance C, Al-Chalabi A, Ruddy D, et al. Familial amyotrophic lateral sclerosis with frontotemporal dementia is linked to a locus on chromosome 9p13.2-21.3. Brain 2006;129:868-876.
7. Dejesus-Hernandez M, Mackenzie IR, Boeve BF, et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 2011;72:245-256.
8. Laaksovirta H, Peuralinna T, Schymick JC, et al. Chromosome 9p21 in amyotrophic lateral sclerosis in Finland: a genome-wide association study. Lancet Neurol 2010;9:978-985.
9. Renton AE, Majounie E, Waite A, et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011;72:257-268.
10. Shatunov A, Mok K, Newhouse S, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
11. van Es MA, Veldink JH, Saris CGJ, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
12. Majounie E, Renton AE, Mok K, et al. Frequency of the C9orf72 hexanucleotide repeat expansion in patients with amyotrophic lateral sclerosis and frontotemporal dementia: a cross-sectional study. Lancet Neurol 2012;11:323-330.
13. Johnson JO, Mandrioli J, Benatar M, et al. Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron 2010;68:857-864.
14. Borroni B, Bonvicini C, Alberici A, et al. Mutation within TARDBP leads to frontotemporal dementia without motor neuron disease. Hum Mutat 2009;30:E974-983.
15. Mackenzie IR, Rademakers R, Neumann M. TDP-43 and FUS in amyotrophic lateral sclerosis and frontotemporal dementia. Lancet Neurol 2010;9:995-1007.
16. Van Deerlin VM, Sleiman PMA, Martinez-Lage M, et al. Common variants at 7p21 are associated with frontotemporal lobar degeneration with TDP-43 inclusions. Nat Genet 2010;42:234-239.
17. Patsopoulos NA, Bayer Pharma MS Genetics Working Group, Steering Committees of Studies Evaluating IFNβ-1b and a CCR1-Antagonist, ANZgene Consortium, GeneMSA, International Multiple Sclerosis Genetics Consortium, de Bakker PIW. Genome-wide meta-analysis identifies novel multiple sclerosis susceptibility loci. Ann Neurol 2011;70:897-912.
18. van der Zee J, Van Langenhove T, Kleinberger G, et al. TMEM106B is associated with frontotemporal lobar degeneration in a clinically diagnosed patient cohort. Brain 2011;134:808-815.
19. Vass R, Ashbridge E, Geser F, et al. Risk genotypes at TMEM106B are associated with cognitive impairment in amyotrophic lateral sclerosis. Acta Neuropathol 2011;121:373-380.
20. Strong MJ, Lomen-Hoerth C, Caselli RJ, et al. Cognitive impairment, frontotemporal dementia, and the motor neuron diseases. Ann Neurol 2003;54:S20-S23.
21. Brooks BR, Miller RG, Swash M, et al. El Escorial revisited: revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotroph Lateral Scler Other Motor Neuron Disord 2000;1:293-299.
22. Price AL, Patterson NJ, Plenge RM, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904-909.
23. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 2009;5:e1000529.
24. de Bakker PIW, Ferreira MAR, Jia X, et al. Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum Mol Genet 2008;17:R122-128.
25. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575.
26. Lee SH, Yang J, Goddard ME, et al. Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics 2012;28:2540-2542.
27. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: A Tool for Genome-wide Complex Trait Analysis. Am J Hum Genet 2011;88:76-82.
28. Yang J, Ferreira T, Morris AP, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet 2012;44:369-375.
29. Breitling R, Armengaud P, Amtmann A, Herzyk P. Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett 2004;573:83-92.
30. Chiò A, Mora G, Restagno G, et al. UNC13A influences survival in Italian amyotrophic lateral sclerosis patients: a population-based study. Neurobiol Aging 2013;34:357.e351-355.
31. Diekstra FP, Van Vught PWJ, van Rheenen W, et al. UNC13A is a modifier of survival in amyotrophic lateral sclerosis. Neurobiol Aging 2012;33:630.e633-630.e638.
32. Varoqueaux F, Sons MS, Plomp JJ, Brose N. Aberrant morphology and residual transmitter release at the Munc13-deficient mouse neuromuscular synapse. Mol Cell Biol 2005;25:5973-5984.
33. Rothstein JD. Current hypotheses for the underlying biology of amyotrophic lateral sclerosis. Ann Neurol 2009;65(suppl 1):S3-9.
34. Huang C-C, Yang D-M, Lin C-C, Kao L-S. Involvement of Rab3A in Vesicle Priming During Exocytosis: Interaction with Munc13-1 and Munc18-1. Traffic 2011;12:1356-1370.
35. Levine TP, Daniels RD, Gatta AT, et al. The product of C9orf72, a gene strongly implicated in neurodegeneration, is structurally related to DENN Rab-GEFs. Bioinformatics 2013;29:499-503.
36. Zhang D, Iyer LM, He F, Aravind L. Discovery of Novel DENN Proteins: Implications for the Evolution of Eukaryotic Intracellular Membrane Structures and Human Disease. Front Genet 2012;3:283.
37. Valdmanis PN, Meijer IA, Reynolds A, et al. Mutations in the KIAA0196 gene at the SPG8 locus cause hereditary spastic paraplegia. Am J Hum Genet 2007;80:152-161.
38. Clemen CS, Tangavelou K, Strucksberg K-H, et al. Strumpellin is a novel valosin-containing protein binding partner linking hereditary spastic paraplegia to protein aggregation diseases. Brain 2010;133:2920-2941.


SUPPLEMENTARY INFORMATION

Supplementary Table 1 Details of included cohorts and quality control

[Table body not recoverable from the source. Legible fragments suggest per-cohort columns for country, genotyping platform (Illumina beadchips, KASPar, TaqMan), numbers of patients and controls, the genomic inflation factor (λGC) and the source publication.]

Supplementary Table 2 Allele and genotype frequencies, and results per stratum for significantly associated SNPs

[Table body not recoverable from the source. For rs12546767 (minor allele C), rs13268726 (minor allele G), rs12608932 (minor allele C) and rs3849943 (minor allele C), the table listed per stratum the MAF, call rate (%), p HWE and genotype distribution (%) in patients (ALS or FTD) and in controls, followed by the OR and p value obtained from dosage data and from hard-called genotypes. Strata named in the table are NL1, NL2, BE1, FR1, UK1, UK2, IR1, IR2, SW1, US1, US2, IT1 and FTD and, where listed, the replication strata (repl) NL3, DE1, SW2, CH1 and IT2.]

Strata are named according to Supplementary Table 1. Genotype distributions are presented as AA / AB / BB, where A represents the major allele and B the minor allele. The table shows that association results using hard-called genotypes do not differ essentially from results obtained using dosage data. PAT = patient; CON = control; MAF = minor allele frequency; HWE = Hardy-Weinberg equilibrium; OR = odds ratio; NA = not available; repl = replication cohort.

Supplementary Table 3 Association results for SNPs previously associated with ALS or FTD

[Table body not recoverable from the source. The table listed chromosome, locus (UNC13A, 9p21.2, FGGY, ITPR2, DPP6, TMEM106B), SNP, the disease with which the SNP was previously associated, minor allele, and the OR and p value from the ALS analysis, the FTD analysis and the meta-analysis. SNPs named in the table are rs12608932, rs2306677, rs6952272, rs10226395, rs1990602, rs1006869, rs6945902, rs3849942, rs2814707, rs10260404, rs6700125, rs10488192, rs1990622, rs1020004, rs1468915, rs903603, rs6966915, rs12671332 and rs1003433.]

Results are shown per disease, and for the combined ALS and FTD meta-analysis. The ALS analysis is based on 4,377 sporadic ALS patients and 13,017 controls. The FTD analysis is based on 435 sporadic FTD-TDP cases and 1,414 population controls. The meta-analysis comprises a total of 4,811 patients with either ALS or FTD, and 14,428 controls. Chr = chromosome; CON = controls; MAF = minor allele frequency; OR = odds ratio.

Supplementary Table 4 Results for rank products analysis of ALS and FTD

[Table body not recoverable from the source. For each SNP the table listed chromosome, nearest gene (C9orf72, MOBKL2B, UNC13A, SHROOM3, CDRT7), minor allele, whether the SNP was genotyped, the OR (p) and rank of the association in ALS and in FTD, and the rank product (RP) with its empirical p value.]

The upper section of the table shows results for the rank products analysis in strata ALS and FTD, while the lower part of the table shows results in strata ALS and in the stratum FTD after removal of 99 FTD cases with motor neuron symptoms. Results are sorted by increasing rank product. Empirical p values are estimated for each rank product after 100,000,000 permutations. SNPs with an empirical p value < 5 × 10⁻⁸ were considered to be significantly top ranked for both diseases. For each SNP, it is indicated whether genotypes were obtained by direct genotyping (+) or by imputation (-). Chr = chromosome; OR = odds ratio; MND = motor neuron disease; RP = rank product.

Supplementary Table 5 Contributors List

THE INTERNATIONAL COLLABORATION FOR FRONTOTEMPORAL LOBAR DEGENERATION Vivianna M Van Deerlin, Patrick M A Sleiman, Maria Martinez-Lage, Alice Chen-Plotkin, Li-San Wang, Neill R Graff-Radford, Dennis W Dickson, Rosa Rademakers, Bradley F Boeve, Murray Grossman, Steven E Arnold, David M A Mann, Stuart M Pickering-Brown, Harro Seelaar, Peter Heutink, John C van Swieten, Jill R Murrell, Bernardino Ghetti, Salvatore Spina, Jordan Grafman, John Hodges, Maria Grazia Spillantini, Sid Gilman, Andrew P Lieberman, Jeffrey A Kaye, Randall L Woltjer, Eileen H Bigio, Marsel Mesulam, Safa al-Sarraj, Claire Troakes, Roger N Rosenberg, Charles L White III, Isidro Ferrer, Albert Lladó, Manuela Neumann, Hans A Kretzschmar, Christine Marie Hulette, Kathleen A Welsh-Bohmer, Bruce L Miller, Ainhoa Alzualde, Adolfo Lopez de Munain, Ann C McKee, Marla Gearing, Allan I Levey, James J Lah, John Hardy, Jonathan D Rohrer, Tammaryn Lashley, Ian R A Mackenzie, Howard H Feldman, Ronald L Hamilton, Steven T Dekosky, Julie van der Zee, Samir Kumar-Singh, Christine Van Broeckhoven, Richard Mayeux, Jean Paul G Vonsattel, Juan C Troncoso, Jillian J Kril, John B J Kwok, Glenda M Halliday, Thomas D Bird, Paul G Ince, Pamela J Shaw, Nigel J Cairns, John C Morris, Catriona Ann McLean, Charles DeCarli, William G Ellis, Stefanie H Freeman, Matthew P Frosch, John H Growdon, Daniel P Perl, Mary Sano, David A Bennett, Julie A Schneider, Thomas G Beach, Eric M Reiman, Bryan K Woodruff, Jeffrey Cummings, Harry V Vinters, Carol A Miller, Helena C Chui, Irina Alafuzoff, Päivi Hartikainen, Danielle Seilhean, Douglas Galasko, Eliezer Masliah, Carl W Cotman, M Teresa Tuñón, M Cristina Caballero Martínez, David G Munoz, Steven L Carroll, Daniel Marson, Peter F Riederer, Nenad Bogdanovic, Gerard D Schellenberg, Hakon Hakonarson, John Q Trojanowski, Virginia M-Y Lee.

THE SLAGEN CONSORTIUM Isabella Fogh, Antonia Ratti, Cinzia Gellera, Kuang Lin, Cinzia Tiloca, Valentina Moskvina, Lucia Corrado, Gianni Sorarù, Cristina Cereda, Stefania Corti, Davide Gentilini, Daniela Calini, Barbara Castellotti, Letizia Mazzini, Giorgia Querin, Stella Gagliardi, Roberto Del Bo, Francesca Luisa Conforti, Cosenza, Gabriele Siciliano, Maurizio Inghilleri, Francesco Saccà, Paolo Bongioanni, Silvana Penco, Massimo Corbo, Sandro Sorbi, Massimiliano Filosto, Alessandra Ferlini, Anna Maria Di Blasio, Stefano Signorini, Nicola Ticozzi, Mauro Ceroni, Elena Pegoraro, Giacomo P Comi, Sandra D’Alfonso, Franco Taroni, Ammar Al-Chalabi, John Powell and Vincenzo Silani.


PART III Genetic disease modifiers

7

UNC13A IS A MODIFIER OF SURVIVAL IN AMYOTROPHIC LATERAL SCLEROSIS

NEUROBIOL AGING. 2012;33(3):630.E3-8

Frank P Diekstra, Paul WJ van Vught, Wouter van Rheenen, Max Koppers, R Jeroen Pasterkamp, Michael A van Es, H Jurgen Schelhaas, Marianne de Visser, Wim Robberecht, Philip Van Damme, Peter M Andersen, Leonard H van den Berg*, Jan H Veldink*

* These authors contributed equally to the manuscript

ABSTRACT

A large genome-wide screen in patients with sporadic amyotrophic lateral sclerosis (ALS) showed that the common variant rs12608932 in the UNC13A gene was associated with disease susceptibility. UNC13A regulates the release of neurotransmitters, including glutamate. Genetic risk factors that also modify survival provide promising therapeutic targets in ALS, a disease whose etiology remains largely elusive. We examined whether UNC13A was associated with survival of ALS patients in a cohort of 450 sporadic ALS patients and 524 unaffected controls from a population-based study of ALS in The Netherlands. Additionally, survival data were collected from individuals of Dutch, Belgian or Swedish descent (1,767 cases, 1,817 controls) who had participated in a previously published genome-wide association study of ALS. We related survival to rs12608932 genotype. In both cohorts, the minor allele of rs12608932 in UNC13A was associated not only with susceptibility, but also with shorter survival of ALS patients. Our results further corroborate the role of UNC13A in ALS pathogenesis.

INTRODUCTION

Amyotrophic lateral sclerosis (ALS) is a fatal adult-onset neurodegenerative disorder characterized by progressive muscle weakness due to the loss of upper and lower motor neurons. No cure is available for ALS and the underlying pathogenesis remains largely elusive. Sporadic ALS is attributed to a combination of genetic and environmental risk factors. Recently, a twin study of sporadic ALS patients estimated heritability to be considerable (0.38-0.76), indicating an important genetic component in disease etiology.¹ Multiple genome-wide association studies (GWAS) have been performed in ALS and, to date, several loci have been shown to be associated, including UNC13A and a locus on chromosome 9p21.2.²⁻⁸ The association with UNC13A was found in a two-stage GWAS comprising 4,855 ALS patients and 14,953 unaffected controls (rs12608932, p = 2.50 × 10⁻¹⁴ for the combined analysis of two stages).⁸ Replication of these results has proven difficult because of very small effect sizes.⁵,⁹

In ALS, where etiology is largely unknown, risk factors that also possess disease-modifying properties provide promising therapeutic targets.¹⁰ These risk factors can best be studied in a population-based cohort of incident ALS patients to reduce referral bias and overrepresentation of patients with better prognosis (spinal onset, young age) as found in cohorts selected from tertiary care institutions or with prevalent cases only.¹¹⁻¹⁴

In the present study we examined whether UNC13A has disease-modifying properties, including association with age at onset and with survival, possibly further corroborating its role in ALS. For this purpose, we recruited an independent, population-based cohort of incident Dutch ALS patients and controls. Subsequently, survival data were collected for cohorts from the previous GWAS and tested for association with ALS.


METHODS

POPULATION-BASED COHORT SUBJECTS

Patient characteristics are outlined in Table 1. For the population-based cohort, patients were included from the neuromuscular centers of the University Medical Center Utrecht, the Academic Medical Center Amsterdam, and the Radboud University Nijmegen Medical Center as part of an ongoing population-based study of ALS in The Netherlands.¹⁵ This study is performed in The Netherlands (41,528 km², population 16,455,911). For the present study, incident ALS cases were identified from January 1, 2006 to December 31, 2009. Prevalent cases were all cases diagnosed before December 31, 2008 and still alive at that date.

To ensure complete case ascertainment, multiple sources were used. In addition to the University Medical Centers (UMCs) cooperating in the Netherlands ALS Center (Amsterdam, Utrecht and Nijmegen), all remaining UMCs not participating in the Netherlands ALS Center and the 30 largest of the 83 general hospitals were visited each year to screen their registers for ALS patients. After a diagnosis has been made, patients in The Netherlands are referred to one of the 46 rehabilitation centers specialized in the care of ALS. All of these centers were visited every year to scrutinize their registers for ALS patients. Lastly, patients were recruited by the Dutch Patient Advocacy Group for Neuromuscular Diseases.

Patients who had not participated in previous GWASs were selected (n = 450). All patients fulfilled the El Escorial criteria for possible, probable or definite ALS.¹⁶ Cases with a family history of ALS or of non-Caucasian descent were excluded.

Control samples were recruited in the population-based study through the general practitioner of the participating patient. The Dutch health care system ensures that all people are registered with a general practitioner. The general practitioner was asked to send information about our study to five people following the patient in the alphabetized register, matched for gender and age plus or minus five years. We included 524 controls whose DNA was available for analysis.

Table 1 Patient characteristics

| Cohort | Country | ALS patients, n | ALS patients, sex F | ALS patients, mean age at onset, yr (range) | ALS patients, site of onset bulbar | Controls, n | Controls, sex F | Controls, mean age, yr (range) |
|---|---|---|---|---|---|---|---|---|
| Population-based | The Netherlands | 450 | 39.6% | 61.3 (16-88) | 35.5% | 524 | 43.5% | 63.4 (21-92) |
| GWAS | The Netherlands | 1012 | 40.5% | 60.7 (20-88) | 32.3% | 1038 | 41.3% | 62.2 (21-92) |
| GWAS | Belgium | 299 | 40.1% | 58.2 (18-85) | 25.9% | 323 | 55.7% | 63.2 (6-88) |
| GWAS | Sweden | 458 | 41.7% | 61.2 (20-87) | 40.1% | 456 | 45.6% | 59.7 (20-94) |
| Total GWAS | | 1769 | 40.8% | 60.3 (18-88) | 32.7% | 1817 | 45.0% | 61.7 (6-94) |

In the population-based cohort, data on either age at onset or site of onset were missing for 5 patients; in the GWAS cohort, these data were missing for 224 patients and 1 control. ALS: amyotrophic lateral sclerosis; GWAS: genome-wide association study; F: female.
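The control sampling rule described above (the five people following the patient in the alphabetized register, matched for gender and age within five years) can be sketched as follows. This is only an illustration: the `Person` record, the `select_controls` function and the example register are hypothetical, and the sketch assumes that "following" means the next matching entries in alphabetical order by surname, which the text does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Person:
    surname: str
    sex: str  # "F" or "M"
    age: int  # age in years


def select_controls(register, patient, n=5, age_window=5):
    """Return up to n register entries that follow the patient alphabetically
    and match on sex and on age within +/- age_window years."""
    ordered = sorted(register, key=lambda p: p.surname.lower())
    following = [p for p in ordered if p.surname.lower() > patient.surname.lower()]
    matches = [
        p for p in following
        if p.sex == patient.sex and abs(p.age - patient.age) <= age_window
    ]
    return matches[:n]


# Hypothetical register of one general practitioner
patient = Person("Jansen", "F", 62)
register = [
    Person("Bakker", "F", 60),   # precedes the patient alphabetically
    Person("Janssen", "F", 64),  # follows, same sex, age within 5 years
    Person("Kok", "M", 61),      # follows, but different sex
    Person("Mulder", "F", 70),   # follows, but age differs by 8 years
    Person("Smit", "F", 58),
    Person("Visser", "F", 66),
]
controls = select_controls(register, patient)
print([c.surname for c in controls])
```

In this toy register only three eligible people follow the patient, so fewer than five invitations would be sent.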

GWAS COHORT SUBJECTS

In addition, we collected data on survival and age at onset of sporadic ALS patients and unaffected individuals from a previously published GWAS.⁸ These data were available for two Dutch cohorts, one Belgian and one Swedish cohort.

Patients in the Dutch cohorts were included through the neuromuscular centers of the University Medical Center Utrecht, the Academic Medical Center Amsterdam, or the Radboud University Nijmegen Medical Center. Approximately 50% of the Dutch patients were included through the above-described ongoing population-based study of ALS in The Netherlands. ALS patients had been screened for superoxide dismutase 1 (SOD1) and angiogenin (ANG) gene mutations, and only patients without mutations in these genes were included. All four grandparents of each patient originated from The Netherlands. Control subjects were unrelated volunteers who were spouses of patients or who accompanied patients to the general neurology outpatient clinic. For patients participating in the ongoing population-based study, controls were collected through the general practitioner. Only controls with no medical history of neurological disorders were included and they were matched to patients for age and sex.

For the Belgian cohort, patients who had been referred to the University Hospital Gasthuisberg, Leuven were included. Patients reported Flemish descent for at least three generations. The Belgian controls included unrelated, healthy Flemish individuals who had married into families that participated in genetic studies of other neurological disorders.

Sporadic ALS patients from Sweden were referred to the Umeå University ALS Clinic and had self-reported Swedish descent for at least three generations. Control samples in the Swedish cohort were spouses of ALS patients or age- and gender-matched, unrelated, healthy controls recruited through the neurological outpatient clinic.

For all cohorts, the diagnosis of probable or definite ALS was made according to the 1994 El Escorial criteria,¹⁶ by specialized neuromuscular neurologists. Patients with a positive family history for ALS were excluded.

GENOTYPING

Genotyping of the rs12608932 SNP in the independent population-based cohort was carried out by use of capillary sequencing. Detailed sequencing methods are available in the Supplementary Data. For patients in the GWAS cohort, genotypes were extracted from Illumina HumanHap 300K and HumanCNV 370K SNP chip data for rs12608932 only, since none of the other SNPs were in linkage disequilibrium (LD) with this variant. Quality control was applied as described previously.⁸ Concordance between direct sequencing and SNP chip genotyping techniques was determined by additionally sequencing 61 randomly selected samples that had been used as control subjects in the previous GWAS.⁸
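A concordance check between two genotyping techniques, as described above, amounts to comparing the calls for the same samples and reporting the share that agree. The sketch below is a minimal illustration under stated assumptions: the function name `concordance_rate` and the example calls are hypothetical, genotypes are written as two-letter strings, allele order is treated as irrelevant ("AC" equals "CA"), and samples with a missing call (`None`) on either platform are left out of the comparison.

```python
def concordance_rate(calls_a, calls_b):
    """Fraction of samples whose unordered genotype calls agree between two
    techniques. Samples with a missing call (None) on either side are skipped."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both call lists must cover the same samples")
    compared = 0
    agreeing = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue
        compared += 1
        # Sort the alleles so that "AC" and "CA" count as the same genotype
        if sorted(a) == sorted(b):
            agreeing += 1
    if compared == 0:
        raise ValueError("no sample has a call on both platforms")
    return agreeing / compared


# Hypothetical calls for five samples: sequencing versus SNP chip
sequencing = ["AA", "AC", "CC", "CA", None]
snp_chip = ["AA", "CA", "CC", "AC", "AA"]
print(concordance_rate(sequencing, snp_chip))
```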

STATISTICAL METHODS

Survival analyses were carried out using Cox regression models, adjusted for gender, age at onset and site of onset (bulbar or spinal). Additionally, in the analysis of the GWAS cohorts, we included a dummy-coded country variable to adjust for possible heterogeneity between countries. Duration of survival was defined as the interval between the age at first symptoms (limb muscle weakness or difficulties with swallowing/speech) and age at death or tracheostomy. We tested the following genetic models: additive (AA vs. AB vs. BB genotypes), dominant (AA vs. AB + BB) and recessive (AA + AB vs. BB), where A represents the major allele and B the minor allele. Cox regression models were tested for non-proportional hazards using a χ² test for correlation between the scaled Schoenfeld residuals of each covariate and survival time. Since survival data showed proportional hazards (p > 0.05), results were derived from Cox regression, and there was no need for a Peto-Prentice generalized Wilcoxon test, which would weight earlier events more heavily. ANOVA was used to determine which genetic model (additive, dominant or recessive) fitted the data best. Kaplan-Meier survival curves were estimated for rs12608932 genotypes according to a recessive model. Survival analyses were carried out using the survival package for R statistical software v2.10 (R Foundation for Statistical Computing, Vienna, Austria) and SPSS v15.0 (SPSS Inc, Chicago, IL).

For association with disease susceptibility we performed logistic regression in PLINK v1.07.¹⁷ In the population-based cohort the logistic model was adjusted for gender. For the GWAS cohort, gender, dummy-coded nationality and ancestry (defined by the first two dimensions of a multidimensional scaling analysis of genome-wide data) were included as covariates. The covariates used in this model were similar to those in the original GWAS.⁸ Tests for deviation from Hardy-Weinberg equilibrium in controls were performed in PLINK using the program's default exact test.¹⁸
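The three genetic models differ only in how a genotype is turned into the covariate that enters the regression. As a minimal sketch (the function name `encode_genotype` is hypothetical and not taken from the analysis scripts of this study), with genotypes expressed as the number of minor (B) alleles carried:

```python
def encode_genotype(minor_allele_count, model):
    """Map a genotype (0 = AA, 1 = AB, 2 = BB) to the covariate value used
    under an additive, dominant or recessive genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    if model == "additive":
        # AA vs. AB vs. BB: one unit of effect per minor allele
        return minor_allele_count
    if model == "dominant":
        # AA vs. AB + BB: carrying at least one minor allele
        return 1 if minor_allele_count >= 1 else 0
    if model == "recessive":
        # AA + AB vs. BB: carrying two minor alleles
        return 1 if minor_allele_count == 2 else 0
    raise ValueError("unknown genetic model: " + model)


for model in ("additive", "dominant", "recessive"):
    print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
```

Under the recessive coding, only patients homozygous for the minor allele are contrasted with all other patients, which is the grouping used for the Kaplan-Meier curves.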

Table 2 Results for survival and disease susceptibility analyses

| Cohort | Country | Genotyped ALS, n | Genotyped CON, n | MAF ALS | MAF CON | Survival: mortality | Survival: HR (95% CI) | Susceptibility: OR (95% CI) |
|---|---|---|---|---|---|---|---|---|
| Population-based | The Netherlands | 412 | 481 | 0.41 | 0.36 | 47.7% | 1.62 (1.16-2.26)** | 1.91 (1.31-2.79)** |
| GWAS | The Netherlands | 1011 | 1038 | 0.40 | 0.36 | 66.3% | 1.18 (0.97-1.44) | 1.46 (1.14-1.87)** |
| GWAS | Belgium | 298 | 323 | 0.43 | 0.37 | 83.1% | 1.22 (0.89-1.67) | 1.20 (0.79-1.81) |
| GWAS | Sweden | 458 | 456 | 0.39 | 0.35 | 100% | 1.46 (1.00-2.12)* | 1.22 (0.83-1.79) |
| Total GWAS | | 1767 | 1817 | 0.40 | 0.36 | 74.5% | 1.23 (1.06-1.43)* | 1.33 (1.10-1.60)** |

In the survival analysis 5 cases were excluded from the population-based cohort and 263 cases from the GWAS cohort due to missing covariate data. Results are shown for analyses under a recessive model. GWAS: genome-wide association study; MAF: minor allele frequency; ALS: amyotrophic lateral sclerosis patients; CON: control subjects; HR: hazard ratio; CI: confidence interval; OR: odds ratio; * = p < 0.05; ** = p < 0.005.

RESULTS

DISEASE SUSCEPTIBILITY

The genotyping rate in the population-based cohort was 92% in both cases and controls. After quality control, the overall genotyping rate for rs12608932 in the GWAS cohort was 99.9%. Comparison of rs12608932 genotypes obtained by direct sequencing and from Illumina SNP chip data in 61 control samples yielded a 100% genotype concordance rate.

ALS susceptibility association test results are presented in Table 2. In both the population-based and GWAS cohorts, there was a significant association with rs12608932 (p = 0.001 and p = 0.002, respectively). The minor allele frequency in ALS patients was slightly, but not significantly, higher in the population-based cohort than in the GWAS cohort (0.41 vs. 0.40; p = 0.66, Pearson χ²), while among controls, frequencies were equal between cohorts (p = 1, Pearson χ²). In the controls of both cohorts, rs12608932 was in Hardy-Weinberg equilibrium (population-based cohort controls p = 0.11, GWAS cohort controls p = 0.36).
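To illustrate what an exact Hardy-Weinberg test computes, the sketch below evaluates the probability of every feasible heterozygote count given the sample size and allele counts, and sums those that are no more likely than the observed count. It follows the general idea of the exact test cited in the methods, but it is an illustrative re-implementation and not PLINK's code; the function name `hwe_exact_p` and the genotype counts in the usage example are hypothetical, since the text does not report genotype counts.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Two-sided exact Hardy-Weinberg p-value for one biallelic SNP.

    Sums the probabilities of all heterozygote counts that are no more
    likely than the observed one, conditional on the allele counts.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + lgamma(n + 1)
                - lgamma(hom_a + 1) - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Feasible heterozygote counts share the parity of the allele counts.
    rare = min(n_a, n_b)
    probs = {h: exp(log_prob(h)) for h in range(rare % 2, rare + 1, 2)}
    observed = probs[n_ab]
    # Small relative tolerance so that ties are counted despite rounding.
    total = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical control genotype counts (AA, AB, BB), for illustration only.
print(hwe_exact_p(200, 220, 61))
```

A p-value close to 1 indicates genotype counts in line with Hardy-Weinberg proportions; very small values flag a deviation, for example due to genotyping error.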

SURVIVAL

Survival analysis results are shown in Table 2. Association with survival was tested in 412 patients in the population-based cohort. We found association with survival for both additive (p = 0.01, hazard ratio (HR) = 1.28) and recessive genetic models. ANOVA analyses showed that the association between rs12608932 and survival fitted the recessive model best (p < 0.001). Figure 1 shows Kaplan-Meier survival curves for the population-based cohort according to rs12608932 genotype status in a recessive model. Subsequently, we collected survival data for the GWAS cohort and tested for an effect of UNC13A on survival. Data on survival were available for 1,504 ALS patients in the GWAS cohort. Again, patients homozygous for the minor allele of rs12608932 had significantly shorter survival compared to the other genotype groups (recessive model, p = 0.01, HR 1.23). The difference in median survival between genotypic groups was 10.0 months in the population-based cohort and 5.0 months in the GWAS cohort.

Figure 1

Kaplan-Meier curves for rs12608932 genotypes according to a recessive genetic model in the population-based cohort. The black curve is for AA or AC genotypes, the grey curve is for the CC genotype. C is the minor allele. The curves are adjusted for the covariates used in the survival analysis.
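For readers unfamiliar with the Kaplan-Meier (product-limit) estimator behind such curves, a minimal sketch follows. It is unadjusted, whereas the curves in Figure 1 are adjusted for covariates, and it is not the code used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time per patient (e.g. months from symptom onset).
    events: 1 if death or tracheostomy was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data: five patients, two of them censored.
print(kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0]))
```

In a recessive-model comparison, the estimator would be applied separately to the CC group and to the combined AA and AC group.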


AGE AT ONSET

There was no significant association with age at onset in either of the tested cohorts for any of the tested genetic models. Results for these analyses are reported in Supplementary Table 1.

DISCUSSION

In the present study, we report the association of rs12608932 in UNC13A with shorter survival of sporadic ALS patients in an independent, population-based incident Dutch cohort. The effect on survival was also present in patients from our previously published GWAS for whom survival data were available. This indicates that UNC13A might act as a disease-modifying gene, further corroborating its role in ALS pathogenesis. Furthermore, the minor allele of rs12608932 was consistently associated with susceptibility to ALS in the independent, population-based cohort. Two genetic variants are in LD with rs12608932 (r² > 0.5, CEU 1000 genomes pilot 1).19 All of these variants map to UNC13A, strongly implicating this gene in ALS susceptibility and survival.

The UNC13A gene encodes protein unc-13 homolog A, which is part of a family of presynaptic proteins in the brain. The UNC13A protein is involved in the regulation of neurotransmitter release at synapses, including at neuromuscular junctions.
Neurotransmitters including glutamate are released by exocytotic fusion of presynaptic vesicles, a process triggered by membrane depolarization and concomitant influx of Ca²⁺ ions.20 Prior to fusion with the presynaptic membrane, vesicles are recruited to the membrane and primed for exocytosis, which forms an important regulatory step in neurotransmitter release.21 Disruption of this priming process by altered function of UNC13A could lead to changes in neurotransmitter release and might, ultimately, lead to death of motor neurons.22 Since UNC13A can directly regulate glutamate release, this mechanism provides support for the glutamate excitotoxicity hypothesis in ALS.23 Notably, the only drug with proven effect on survival in ALS is riluzole, a glutamate release inhibitor.24

In the present study, we found a higher minor allele frequency (MAF) in cases (0.41) than was found in previously published populations from France (0.34) and the UK (0.37).5, 9 In control subjects, the MAF was comparable to other studies (0.34 - 0.36).5, 8, 9 The higher minor allele frequency in our cases might be explained as follows. Previous studies, including studies of KIFAP3 as a modifier of survival, have shown that estimations of determinants of disease characteristics such as survival are strongly dependent on patient selection.11, 13, 14, 25 These studies demonstrated greater proportions of short survivors in incident, population-based cohorts than in prevalent cohorts. In the present study, the rs12608932 minor allele was associated with shorter survival. Given this association, MAF may be strongly dependent on patient selection criteria, leading to higher frequencies in patients in incident, population-based cohorts, which include ALS patients with shorter median survival times.
Patients in the independent population-based cohort in the present study, and to a certain extent in the GWAS cohort, were selected from a population-based study of ALS in The Netherlands, which most probably explains the relatively high MAF of rs12608932 in our cohorts. Conversely, the use of referral-based or prevalent cases would imply a lower proportion of carriers of the rs12608932 minor allele, and thus could lead to non-replication of UNC13A as a susceptibility gene for ALS. Therefore, referral bias, combined with reduced power (18 and 52%, respectively), forms a plausible explanation for non-replication of this variant in the French and British cohorts.5, 9

As UNC13A might act as a modifier of disease survival, patient selection methods form an important aspect of study design when trying to replicate findings. Larger independent studies may be needed to establish a definite role for UNC13A in ALS susceptibility and survival. Ideally, study cohorts are derived from population-based studies. Functional studies could provide insight into a pathogenic mechanism linking the genetic variant rs12608932 in UNC13A to motor neuron degeneration in ALS. Ultimately, therapeutic targets in the UNC13A pathophysiological pathway might be identified and targeted to prolong the survival of ALS patients.


REFERENCES

1. Al-Chalabi A, Fang F, Hanby MF, Leigh PN, Shaw CE, et al. An estimate of amyotrophic lateral sclerosis heritability using twin data. J Neurol Neurosurg Psychiatry 2010;81:1324-1326.
2. Chiò A, Schymick JC, Restagno G, Scholz SW, Lombardo F, et al. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2009;18:1524-1532.
3. Dunckley T, Huentelman MJ, Craig DW, Pearson JV, Szelinger S, et al. Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med 2007;357:775-788.
4. Schymick JC, Scholz SW, Fung H-C, Britton A, Arepalli S, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 2007;6:322-328.
5. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
6. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877.
7. van Es MA, van Vught PWJ, Blauw HM, Franke L, Saris CGJ, et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet 2008;40:29-31.
8. van Es MA, Veldink JH, Saris CGJ, Blauw HM, van Vught PWJ, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
9. Daoud H, Belzil V, Desjarlais A, Camu W, Dion PA, et al. Analysis of the UNC13A gene as a risk factor for sporadic amyotrophic lateral sclerosis. Arch Neurol 2010;67:516-517.
10. Cheung YK, Gordon PH, Levin B. Selecting promising ALS therapies in clinical trials. Neurology 2006;67:1748-1751.
11. Chiò A, Logroscino G, Hardiman O, Swingler R, Mitchell D, et al. Prognostic factors in ALS: A critical review. Amyotroph Lateral Scler 2009;10:310-323.
12. Laaksovirta H, Peuralinna T, Schymick JC, Scholz SW, Lai S-L, et al. Chromosome 9p21 in amyotrophic lateral sclerosis in Finland: a genome-wide association study. Lancet Neurol 2010;9:978-985.
13. Sorenson EJ, Mandrekar J, Crum B, Stevens JC. Effect of referral bias on assessing survival in ALS. Neurology 2007;68:600-602.
14. Traynor BJ, Nalls M, Lai S-L, Gibbs RJ, Schymick JC, et al. Kinesin-associated protein 3 (KIFAP3) has no effect on survival in a population-based cohort of ALS patients. Proc Natl Acad Sci U S A 2010;107:12335-12338.
15. Huisman MHB, de Jong SW, van Doormaal PTC, Weinreich SS, Schelhaas HJ, et al. Population based epidemiology of amyotrophic lateral sclerosis using capture-recapture methodology. J Neurol Neurosurg Psychiatry 2011.
16. Brooks BR. El Escorial World Federation of Neurology criteria for the diagnosis of amyotrophic lateral sclerosis. Subcommittee on Motor Neuron Diseases/Amyotrophic Lateral Sclerosis of the World Federation of Neurology Research Group on Neuromuscular Diseases and the El Escorial “Clinical limits of amyotrophic lateral sclerosis” workshop contributors. J Neurol Sci 1994;124 (suppl.):96-107.
17. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575.
18. Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet 2005;76:887-893.
19. Johnson AD, Handsaker RE, Pulit SL, Nizzari MM, O’Donnell CJ, et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 2008;24:2938-2939.
20. Augustin I, Rosenmund C, Südhof TC, Brose N. Munc13-1 is essential for fusion competence of glutamatergic synaptic vesicles. Nature 1999;400:457-461.
21. Zikich D, Mezer A, Varoqueaux F, Sheinin A, Junge HJ, et al. Vesicle priming and recruitment by ubMunc13-2 are differentially regulated by calcium and calmodulin. J Neurosci 2008;28:1949-1960.
22. Varoqueaux F, Sons MS, Plomp JJ, Brose N. Aberrant morphology and residual transmitter release at the Munc13-deficient mouse neuromuscular synapse. Mol Cell Biol 2005;25:5973-5984.
23. Rothstein JD. Excitotoxicity hypothesis. Neurology 1996;47:S19-26.
24. Miller RG, Mitchell JD, Lyon M, Moore DH. Riluzole for amyotrophic lateral sclerosis (ALS)/motor neuron disease (MND). Cochrane Database Syst Rev 2007:CD001447.
25. Landers JE, Melki J, Meininger V, Glass JD, van den Berg LH, et al. Reduced expression of the Kinesin-Associated Protein 3 (KIFAP3) gene increases survival in sporadic amyotrophic lateral sclerosis. Proc Natl Acad Sci U S A 2009;106:9004-9009.


SUPPLEMENTARY DATA

SEQUENCING METHODS

Genotyping of the rs12608932 SNP in the independent population-based cohort was carried out by capillary sequencing. We used the following primers for PCR amplification: forward 5’ ATGAAATGTTGGATGAGCAG; reverse 5’ CACACCCACCCATCTAACTAC. Primers were designed using LIMSTILL (http://limstill.niob.knaw.nl). PCR was carried out using a touchdown thermocycling program (92 °C for 60 s; 15 cycles of 92 °C for 20 s, 65 °C for 30 s with a decrement of 0.5 °C per cycle, 72 °C for 60 s; followed by 30 cycles of 92 °C for 20 s, 58 °C for 30 s and 72 °C for 60 s; 72 °C for 180 s; GeneAmp9700, Applied Biosystems, Foster City, California, USA). The PCR reaction mixture consisted of 5 µl amplified DNA (5 ng/µl), 0.2 µM of each primer, 200 µM of each deoxynucleotide triphosphate (dNTP), 25 mM Tricine, 7.0% glycerol (w/v), 1.6% dimethyl sulfoxide (DMSO, w/v), 2 mM MgCl2, 85 mM ammonium acetate pH 8.7 and 0.04 U Taq Polymerase in a total volume of 10 µl.
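The touchdown program above can be written out step by step to check the annealing temperatures. The sketch below is purely illustrative; it assumes that the first touchdown cycle anneals at 65 °C and that the 0.5 °C decrement applies from the second cycle onward, which the text does not state explicitly. The function name is hypothetical.

```python
def touchdown_program():
    """Expand the PCR program into (phase, cycle, step, temp_C, seconds) tuples."""
    steps = [("initial denaturation", 0, "denature", 92.0, 60)]
    # Touchdown phase: annealing starts at 65 C and drops 0.5 C per cycle
    # (assumption: the decrement is applied from the second cycle onward).
    for cycle in range(1, 16):
        anneal = 65.0 - 0.5 * (cycle - 1)
        steps += [("touchdown", cycle, "denature", 92.0, 20),
                  ("touchdown", cycle, "anneal", anneal, 30),
                  ("touchdown", cycle, "extend", 72.0, 60)]
    # Fixed phase: 30 cycles with a constant 58 C annealing step.
    for cycle in range(1, 31):
        steps += [("amplification", cycle, "denature", 92.0, 20),
                  ("amplification", cycle, "anneal", 58.0, 30),
                  ("amplification", cycle, "extend", 72.0, 60)]
    steps.append(("final extension", 0, "extend", 72.0, 180))
    return steps


program = touchdown_program()
touchdown_anneal = [s[3] for s in program if s[0] == "touchdown" and s[2] == "anneal"]
print(touchdown_anneal[0], touchdown_anneal[-1])      # first and last annealing temperature
print(sum(s[4] for s in program) / 60, "min of programmed hold time")
```

Under this assumption the last touchdown cycle anneals at 58 °C, the same temperature used in the subsequent 30 fixed cycles.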

PCR products were diluted in 20 µl H2O and 1 µl was directly used as template for the sequencing reactions. Sequencing reactions contained 0.1 µl BigDYE (v3.1; Applied Biosystems), 1.99 µl 2.5x dilution buffer (Applied Biosystems) and 0.4 µM of the primer used in the PCR reaction (either forward or reverse) in a total volume of 5 µl. The reactions were performed using cycling conditions as follows: 40 cycles of 92 °C for 10 s, 50 °C for 5 s and 60 °C for 120 s. Sequencing products were purified by ethanol precipitation in the presence of 40 mM sodium-acetate and analyzed on a 96-capillary 3730XL DNA analyzer (Applied Biosystems), using the standard RapidSeq protocol on 36 cm array. Traces were analyzed to determine rs12608932 genotypes using PolyPhred and in-house developed software.

Supplementary Table 1 Results for age at onset analyses

| Cohort | ALS, n | Age at onset, yr, mean (range) | Cox regression: additive, HR (p) | Cox regression: dominant, HR (p) | Cox regression: recessive, HR (p) |
|---|---|---|---|---|---|
| Population-based | 412 | 61.3 (16-88) | 1.07 (0.36) | 1.04 (0.69) | 1.17 (0.22) |
| GWAS | 1767 | 60.3 (18-88) | 1.00 (0.97) | 0.97 (0.52) | 1.06 (0.37) |

In the age at onset analyses 5 cases were excluded from the population-based cohort and 263 cases from the GWAS cohort due to missing covariate data. Results are shown for the three genetic models tested. GWAS: genome-wide association study; ALS: amyotrophic lateral sclerosis; HR: hazard ratio.


8

GENETIC MODIFIERS IN C9ORF72 REPEAT EXPANSION CARRIERS: A GENOME-WIDE ANALYSIS

MANUSCRIPT IN PREPARATION

Frank P Diekstra, Michael A Nalls, Vivianna M Van Deerlin†, Raffaele Ferrari, Michael A van Es, John C van Swieten†, Peter Heutink, Aleksey Shatunov, Ammar Al-Chalabi, Orla Hardiman, Pamela J Shaw, Karen E Morrison, Philip van Damme, Wim Robberecht, Julie van der Zee, Christine van Broekhoven, Alexis Brice, Isabelle Le Ber, Caroline Graff, Stuart Pickering-Brown, Adriano Chio, Andrew B Singleton, Bryan J Traynor, John Hardy, Rosa Rademakers, Leonard H van den Berg*, Jan H Veldink*

* These authors were joint senior authors on this work
† On behalf of the International Collaboration for Frontotemporal Lobar Degeneration; see Appendix for full list of contributors

ABSTRACT

A hexanucleotide repeat expansion in C9orf72 is one of the most important genetic causes of both amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). C9orf72 repeat expansion carriers show great phenotypic heterogeneity, ranging from pure motor symptoms to solely behavioral or cognitive symptoms. To date, the factors that determine the C9orf72 expansion phenotype are still largely unknown, although there is evidence for genetic variants in TMEM106B (previously implicated in the susceptibility to FTD) and ATXN2 as genetic modifiers in patients with a C9orf72 expansion. We sought to identify additional genetic variants that may act as a genetic switch to determine the onset of either ALS or FTD in C9orf72 repeat expansion carriers. We collected previously published genome-wide data from 477 ALS and 285 FTD patients carrying a C9orf72 expansion, while also adding a new cohort of 69 C9orf72 expansion carriers with ALS. We used genome-wide imputation of genetic variants to increase genome coverage and comparability. Genetic variants were tested using logistic regression models for association with either ALS or FTD. Our study did not yield genome-wide significant associations, but we found a nominally significant association between rs3173615 in TMEM106B and onset of either ALS or FTD (OR 0.74, p=0.015). In a similar candidate gene approach, we found a nominally significant association between rs11065979 in ATXN2 and C9orf72 disease phenotype (OR 1.35, p=0.009). Although our dataset provides a large and unique set of both ALS and FTD cases all carrying a C9orf72 repeat expansion, statistical power to discover associations with small effect sizes (OR < 2) was limited. Therefore, further international collaboration will be needed to increase sample size and power.

INTRODUCTION

Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterized by the loss of motor neurons in both brain and spinal cord leading to progressive muscle weakness. There is no cure for the disease and patients typically die due to respiratory insufficiency, with a median survival time of approximately three years.1 Approximately 50 percent of ALS patients have a certain degree of cognitive symptoms, while about 5-15 percent are diagnosed with frontotemporal dementia (FTD) in the course of the disease.2-5 Conversely, about 3-14 percent of FTD cases develop motor neuron symptoms.3

Frontotemporal dementia is a relatively rare form of dementia characterized by changes in cognition, behavior and language. Although the population incidence rates are relatively low (approximately 3/100,000 per year), FTD is the second most common form of dementia under the age of 65 years.6, 7

Besides the clinical coincidence of ALS and FTD, the two diseases have important neuropathological and genetic overlap. The two major subtypes of FTD are characterized by cellular inclusion of either tau (FTD-tau) or TDP-43 (FTD-TDP). TDP-43 inclusions have been identified in neurons of both ALS and FTD-TDP patients.8 Genetic studies have found mutations in VCP and FUS in both diseases, but the discovery of a hexanucleotide repeat expansion in an intron of the C9orf72 gene on chromosome 9p21.2 provides the strongest genetic link between the two clinical entities.9-12 The chromosome 9p21.2 locus was first identified in linkage studies in families with ALS and FTD cases.13, 14 Subsequently, GWAS have been able to fine map the locus to three genes, of which C9orf72 has been discovered to harbor the causal variant.9, 10, 15-17 Approximately 6% of sporadic ALS and FTD patients carry the expanded C9orf72 repeat, while in familial ALS and FTD this percentage reaches 37% and 25%, respectively.18

Patients carrying the C9orf72 repeat expansion have a considerable phenotypic heterogeneity, ranging from pure motor neuron symptoms to a cognitive/behavioral phenotype. Previous studies have looked at genetic variants in genes that have been implicated in ALS or FTD susceptibility, and compared their allele frequencies to control subjects. In the present study we sought to identify genetic modifiers acting as a switch to determine the onset of either FTD or ALS within C9orf72 repeat expansion carriers. Finding such ‘genetic switches’ may have a large impact on genetic counseling of asymptomatic C9orf72 repeat expansion carriers, and the identification of these switches might provide more insight into the pathways involved in the pathogenesis of both ALS and FTD. We, therefore, conducted a genome-wide analysis of patients with C9orf72 expansions, directly comparing ALS patients to subjects with FTD.

SUBJECTS AND METHODS

SUBJECTS

ALS and FTD subjects were obtained from available and previously published GWASs of ALS or FTD patients.15-17, 19-26 We included 31 cohorts from The Netherlands, Belgium, France, Italy, Germany, United Kingdom, Ireland, Sweden, Finland, United States and Canada. All individuals were of Caucasian descent. We excluded cohorts that had genotypes for selected SNP sets only (for example using an Illumina NeuroX custom chip). All ALS patients were diagnosed with probable or definite ALS according to the revised El Escorial criteria.27 For the FTD-TDP cohort, FTD diagnosis was confirmed by TDP-43 immunohistochemistry, while FTD cases in other cohorts were included based on a clinical diagnosis according to the Neary criteria.28

Additionally, we included a newly genotyped cohort of 1,226 ALS patients from the Netherlands. Cases were diagnosed with probable or definite ALS according to the revised El Escorial Criteria by neurologists specialized in motor neuron diseases. Tertiary referral centers for ALS were University Medical Center Utrecht, Academic Medical Centre Amsterdam and Radboud University Medical Center Nijmegen. We only included C9orf72 repeat expansion carriers, diagnosed with either ALS or FTD or both. C9orf72 repeat expansion status was determined at submitting sites by repeat-primed PCR or Southern blot, as has been described previously.9, 10, 29 All participants gave written informed consent and approval was obtained from the local institutional review boards. More detailed information on ALS or FTD subject selection methods has been published previously.15-17, 19-25

Diagnoses of ALS, ALS-FTD or FTD were made by neurologists in specialized neurological centers. As noted previously, the diagnosis of ALS may coincide with FTD and vice versa. In the present study we aimed at identifying genetic variants that would influence the course of disease in C9orf72 repeat expansion carriers. Therefore, for the categorization of patients ultimately having both ALS and FTD, we used the predominant clinical phenotype (either ALS or FTD) at disease onset, as provided by the clinicians at the collaborating sites.

GENOTYPES AND QUALITY CONTROL

All cohorts were genotyped using Illumina BeadChip arrays. We formed strata based on array version and study, while trying to maximize stratum size for quality control procedures. Per stratum, we removed SNPs that could cause allele swaps (tri-allelic SNPs, A/T or C/G SNPs), SNPs that were not present in dbSNP137, SNPs with a genotyping call rate < 95%, SNPs with a minor allele frequency (MAF) in the stratum < 1% or a MAF in the 1000 Genomes project < 5%, SNPs strongly deviating from Hardy-Weinberg equilibrium (p < 1×10⁻⁶), and SNPs where genotypes had differing missing rates between flanking haplotypes (PLINK mishap test p < 1×10⁻⁵). Samples were removed if they had a genotyping call rate < 95%, a high or low heterozygosity rate (more than 3 standard deviations (SD) from the mean inbreeding coefficient (F value) per stratum), or a genetic gender that did not match the gender in the phenotype file.

For the purpose of additional quality control, we formed a merged set of all samples using only SNPs overlapping all cohorts. We removed duplicate samples (PI_HAT > 0.9) and related samples (PI_HAT > 0.125), where for duplicated or related sample pairs only the sample with the lowest call rate was removed. Samples were merged with HapMap Phase 3 version 3 individuals, and a principal components analysis (PCA) was conducted using EIGENSTRAT v5.0 to identify population substructure.30 Population outliers were defined as samples deviating by more than 10 SD from the mean of the PCA values of the HapMap ‘EUR’ and ‘TSI’ populations for each of the first 4 principal components. All quality control procedures were carried out using PLINK31 and R (www.r-project.org).
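The per-stratum filters listed above can be summarized as a pair of small predicates. This is a sketch of the thresholds named in the text, not the PLINK and R pipeline that was actually used; the function names and argument layout are hypothetical, and the dbSNP137 lookup is omitted.

```python
from statistics import mean, stdev


def snp_passes_qc(alleles, call_rate, maf_stratum, maf_reference, hwe_p, mishap_p):
    """Apply the per-stratum SNP thresholds from the text to a single SNP."""
    allele_set = set(alleles)
    strand_ambiguous = allele_set in ({"A", "T"}, {"C", "G"})
    return (len(allele_set) == 2          # excludes tri-allelic SNPs
            and not strand_ambiguous      # excludes A/T and C/G SNPs
            and call_rate >= 0.95
            and maf_stratum >= 0.01
            and maf_reference >= 0.05     # MAF in the reference panel
            and hwe_p >= 1e-6
            and mishap_p >= 1e-5)


def heterozygosity_outliers(f_values, n_sd=3.0):
    """Return indices of samples whose inbreeding coefficient F lies more
    than n_sd standard deviations from the stratum mean."""
    centre = mean(f_values)
    spread = stdev(f_values)
    return [i for i, f in enumerate(f_values) if abs(f - centre) > n_sd * spread]


print(snp_passes_qc(("A", "G"), 0.99, 0.20, 0.22, 0.40, 0.50))  # passes all filters
print(snp_passes_qc(("A", "T"), 0.99, 0.20, 0.22, 0.40, 0.50))  # strand-ambiguous
```

Removing strand-ambiguous A/T and C/G SNPs avoids allele swaps when strata genotyped on different array versions are later pooled.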

GENOME-WIDE SNP IMPUTATION

In order to increase comparability and coverage, we performed genome-wide SNP imputation using the 1000 genomes Phase I Integrated Release Version 3 reference panel and MaCH v1.0 software.32 Imputation was carried out per stratum. Datasets were split into chromosomes, and subsequently, chromosomes were split into chunks of 2500 SNPs with a 500-SNP overlap. Imputation was parallelized by using a prephasing step (MaCH) and a genotype imputation step (minimac), as described previously.33 All program parameters were left at their default values. Imputed genotypes were stored as allele dosages, which are continuous numerical values indicating the estimated number of minor alleles (ranging from 0 to 2). Only SNPs with a MAF > 0.01 and an imputation quality score (r²) > 0.6 were included for further analyses.
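The chunking step can be sketched as follows. The sketch assumes that each chunk holds at most 2500 SNPs and that consecutive chunks share 500 SNPs, so that chunk starts advance by 2000 SNPs; the text does not specify whether the overlap is counted inside or outside the 2500 SNPs, and the function name is hypothetical.

```python
def make_chunks(n_snps, chunk_size=2500, overlap=500):
    """Return (start, end) index pairs, end exclusive, covering n_snps markers.

    Assumption: the overlap is part of the chunk, so each new chunk starts
    chunk_size - overlap markers after the previous one.
    """
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_size, n_snps)
        chunks.append((start, end))
        if end == n_snps:
            break
        start += step
    return chunks


print(make_chunks(6000))
```

Overlapping chunks give the phasing and imputation steps flanking haplotype context at chunk borders; markers in the overlap are imputed twice and one copy is kept when chunks are merged.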


ASSOCIATION ANALYSIS

For association analyses between SNP genotypes and susceptibility to either ALS or FTD, we opted to pool all strata into a single analysis. A weighted meta-analysis of strata was not feasible because of the lack of balanced strata and the limited power of the relatively small strata. In the pooled analysis only SNPs for which genotype data were present in all strata were included. We performed logistic regression in PLINK, correcting for gender and the first two principal components, which were strongly (p < 1×10⁻⁵) associated with phenotypic outcome.
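As an illustration of the type of model fitted per SNP, the sketch below fits a logistic regression of phenotype on allele dosage by Newton-Raphson and reports the odds ratio per minor allele. It is deliberately simplified: the covariates used in the study (gender and two principal components) are omitted, the coding FTD = 1 and ALS = 0 is an assumption, the data are made up, and this is not PLINK's implementation.

```python
from math import exp


def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + exp(-(b0 + b1 * xi))) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of the 2 x 2 information matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up dosages (0, 1 or 2 minor alleles); y = 1 for FTD, y = 0 for ALS.
dosage = [0] * 10 + [1] * 20 + [2] * 20 + [0] * 20 + [1] * 20 + [2] * 10
phenotype = [1] * 50 + [0] * 50
b0, b1 = fit_logistic(dosage, phenotype)
print("odds ratio per minor allele:", exp(b1))
```

With imputed data the same model applies unchanged, because dosages are continuous values between 0 and 2.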

RESULTS

After quality control, there were 15 strata with a total of 402 ALS patients and 253 FTD patients (Table 1). See Supplementary Table 1 for more details on quality control results. There were 134,258 SNPs with genotypes present across all strata, which we used for population outlier detection and relatedness checks (Supplementary Figure 1). We performed genome-wide SNP imputation per stratum using the 1000 genomes phase I reference panel. Because of the limited study sample size, and in order to prevent spurious association signals, very low-frequency SNPs (MAF < 0.01) were excluded. Also, in our pooled analysis we only analyzed SNPs that had genotypes in all strata, ultimately leaving 3,432,133 SNPs for analysis. We tested SNP genotypes for association with susceptibility to either ALS or FTD using logistic regression analysis. Figure 1 shows a Manhattan plot of association results. The genomic inflation factor was 1.045, indicating adequate quality control. A quantile-quantile plot is shown in Supplementary Figure 2. The top ten most significant hits are shown in Table 2. We found no SNP markers reaching genome-wide significance (p < 5×10⁻⁸).
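The genomic inflation factor quoted above is conventionally computed as the median observed 1-df χ² statistic divided by its expected median under the null hypothesis (about 0.455). The sketch below uses that common definition with made-up p-values; whether the study used exactly this estimator is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Median observed 1-df chi-square statistic over its expected median.

    p_values must lie in (0, 1].
    """
    norm = NormalDist()
    # A two-sided p-value corresponds to chi2 = z**2 with z = Phi^-1(p / 2).
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = norm.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2) / expected_median


# Evenly spread p-values mimic a well-calibrated test and give a value near 1.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
print(genomic_inflation(uniform_p))
```

Values well above 1 suggest residual population stratification or other systematic bias; values close to 1 indicate that the test statistics are not inflated.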

Table 1 Stratum details after quality control

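| Stratum name | Reference | n SNPs | n ALS | n FTD |
|---|---|---|---|---|
| ALS_BE1 | van Es, 2009 (ref. 16) | 310485 | 28 | 0 |
| ALS_IR1 | Cronin, 2008 (ref. 21) | 454297 | 8 | 0 |
| ALS_IR2 | van Es, 2009 (ref. 16) | 492023 | 13 | 0 |
| ALS_IT1 | Chiò, 2009 (ref. 22) | 472105 | 12 | 0 |
| ALS_IT2 | Traynor, 2010 (ref. 23) | 465902 | 7 | 0 |
| ALS_NL1 | van Es, 2007 (ref. 20) | 296453 | 26 | 0 |
| ALS_NL2 | van Es, 2009 (ref. 16) | 313509 | 20 | 0 |
| ALS_UK2 | Shatunov, 2010 (ref. 15) | 505955 | 42 | 0 |
| FTD_TDP | Van Deerlin, 2010 (ref. 24) | 486937 | 6 | 86 |
| ALS_FI1 | Laaksovirta, 2010 (ref. 17) | 308211 | 86 | 0 |
| ALS_NIH | Johnson, 2014 (ref. 26) | 582676 | 92 | 0 |
| ALS_US1 | Schymick, 2007 (ref. 19) | 471685 | 16 | 0 |
| FTD_CLI1 | Ferrari, 2014 (ref. 25) | 485427 | 1 | 140 |
| FTD_CLI2 | Ferrari, 2014 (ref. 25) | 580927 | 2 | 27 |
| ALS_NL3 | Present study | 572333 | 43 | 0 |
| Total | | | 402 | 253 |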

Subsequently, we specifically investigated SNPs in genes that have previously been implicated as disease modifiers in C9orf72 repeat expansion carriers. For rs3173615 in TMEM106B, van Blitterswijk et al. compared minor allele frequencies in ALS or FTD patients with C9orf72 expansions to unaffected controls.34 They found a decreased minor allele frequency for FTD patients (35.5% vs. 43.2%, Cohort 1), in particular under a recessive genetic model (OR 0.33, p=0.009), while this effect was not significant in C9orf72 repeat expansion carriers with ALS (OR 0.85, p=0.55). Similar to the findings of van Blitterswijk et al., we found a lower minor allele frequency for rs3173615 in FTD cases with C9orf72 expansions compared to ALS cases (OR 0.74, p=0.015). Another SNP in TMEM106B (rs57506017) showed a stronger signal (OR 0.65, p=0.002). See Table 3 for more detailed results.

Figure 1 Manhattan plot

Each dot represents a single nucleotide polymorphism; -log10 p values are shown on the y-axis, and chromosomal positions on the x-axis. Chromosomes are numbered along the x-axis and are designated by changing colors. The threshold for genome-wide significance (p < 5×10⁻⁸) is indicated by a dotted line.

Table 2 Top 10 association loci from genome-wide analysis

| Locus | Nearest gene | n SNPs | Top SNP | Minor allele | MAF | OR | p |
|---|---|---|---|---|---|---|---|
| 3p25.1 | COLQ | 11 | rs73146147 | C | 0.11 | 0.41 | 2.60×10⁻⁶ |
| 12q23.2 | C12orf42 | 19 | rs1401994 | T | 0.47 | 0.56 | 4.32×10⁻⁶ |
| 8p23.1 | RP1L1 | 4 | rs10089537 | G | 0.44 | 1.79 | 5.18×10⁻⁶ |
| 22q13.31 | LDOC1L | 3 | rs135918 | A | 0.48 | 0.53 | 5.90×10⁻⁶ |
| 22q13.1 | TRIOBP | 1 | rs5750482 | T | 0.42 | 0.60 | 2.15×10⁻⁵ |
| 7p15.3 | DNAH11 | 4 | rs115529292 | C | 0.21 | 0.52 | 6.83×10⁻⁶ |
| 4p15.2 | KCNIP4 | 1 | rs199767347 | del | 0.45 | 1.79 | 9.12×10⁻⁶ |
| 1q41 | HHIPL2 | 1 | rs35763770 | A | 0.28 | 0.53 | 9.39×10⁻⁶ |
| 3p21.31 | SACM1L | 1 | rs2673050 | T | 0.44 | 1.74 | 1.18×10⁻⁵ |
| 4q25 | - | 2 | rs2443054 | T | 0.31 | 0.54 | 1.22×10⁻⁵ |

Per locus, the number of SNPs with p < 1×10⁻⁴ is indicated, and association results for the SNP with the most significant p-value (top SNP) are presented. MAF = weighted minor allele frequency across all datasets; OR = odds ratio; del = deletion.


Table 3 Association results for loci previously identified as disease modifiers in C9orf72 repeat expansion carriers

| Locus | SNP | Minor allele | MAF FTD | MAF ALS | OR | p value |
|---|---|---|---|---|---|---|
| TMEM106B | rs3173615 | G | 0.368 | 0.399 | 0.74 | 0.015 |
| TMEM106B | rs57506017 | T | 0.227 | 0.302 | 0.65 | 0.002 |
| ATXN2 | rs11065979 | T | 0.472 | 0.389 | 1.38 | 0.009 |

An odds ratio less than 1 indicates that the minor allele was less frequent in FTD compared to ALS C9orf72 repeat expansion carriers. MAF = minor allele frequency, OR = odds ratio.

Furthermore, an intermediate-length polyglutamine repeat in ATXN2 has previously been found more frequently in C9orf72 expansion carriers with ALS/ALS-FTD than in expansion carriers with pure FTD.35 We, therefore, explored SNPs in the ATXN2 locus for association with an ALS or FTD phenotype in our dataset. We found the most significant association with rs11065979, located within a region of strong LD comprising ATXN2 (OR 1.38, p=0.009, Table 3). The minor allele was associated with an increased risk for the FTD phenotype.

DISCUSSION

In the present study we aimed at identifying genetic modifiers that would determine the onset of either ALS or FTD in C9orf72 repeat expansion carriers. Therefore, we have collected the largest genome-wide data set of C9orf72 repeat expansion carriers with ALS or FTD. We applied stringent quality control to the raw genotype data and performed genome-wide SNP imputation using the 1000 genomes reference panel, yielding over 3 million SNPs with high accuracy (i.e. MaCH imputation r² > 0.9). Our analysis did not yield any genome-wide significant association signals. Additionally, in a candidate gene approach, we were able to confirm previously identified disease modifiers (in TMEM106B and ATXN2) in C9orf72 repeat expansion carriers.

The TMEM106B locus was first identified as a risk locus for FTD in a GWAS of FTD patients with neuronal TDP-43 inclusions (FTD-TDP).24 Further studies have confirmed this association and, additionally, implicated genetic variants in TMEM106B as a risk factor for cognitive impairment in ALS patients.36-38 More recently, TMEM106B has been identified as an important disease modifier in subjects with C9orf72 repeat expansions, modifying age at onset, age at death and the risk of developing FTD.34, 39 Our study confirms the previous finding that in C9orf72 expansion carriers, the minor allele of rs3173615 is less frequent in FTD patients than in patients with ALS. TMEM106B is a transmembrane protein, predominantly localized at the lysosomal membrane, where it may regulate lysosomal size, motility and responsiveness to stress.40 TMEM106B may interact with MAP6, which is necessary for normal dendritic trafficking of lysosomes, implicating lysosomal biology in the pathogenesis of the ALS-FTD spectrum.41

As noted previously, intermediate-length repeat expansions in the Ataxin-2 gene (ATXN2) may modify the disease phenotype in C9orf72 repeat expansion carriers, conferring an increased risk of the ALS phenotype.35 In the present study, we did not have ATXN2 repeat lengths available, but we used SNP genotypes as a proxy instead. We found a nominally significant association with rs11065979, of which the major allele was associated with an increased risk of ALS compared to FTD. Further studies, measuring ATXN2 repeat lengths in different C9orf72 phenotypes, are needed to definitively establish the role of ATXN2 as a disease modifier in C9orf72 repeat expansion carriers.

For the present study we collected a large and unique set of C9orf72 repeat expansion carriers through international collaboration. The uniformity of our study dataset (i.e. all patients share the same genetic aberration in C9orf72) forms an important means to increase statistical power. However, for a GWAS, the numbers of ALS and FTD patients we were able to include are still small. We might have failed to identify an association with genome-wide significance due to the lack of statistical power. Power calculations estimated approximately 80% power for the detection of an association with odds ratio 2.0, minor allele frequency 0.4 at α = 5×10⁻⁸. However, power dropped dramatically for smaller effect sizes. For example, we had about 5% power for an association with OR 1.5. We have collected the largest sample of C9orf72 repeat expansion carriers with genome-wide SNP data, but future efforts should be made, through international collaboration and by applying linear mixed model association techniques,42 to further explore the C9orf72 genetic switch hypothesis.

We aimed to categorize C9orf72 expansion carriers according to ALS or FTD phenotype as accurately as possible, using the most reliable and most clinically relevant predominant symptom at disease onset.
However, as previously noted in the literature, ALS and FTD may form parts of a spectrum.3 Therefore, while the pure motor and pure FTD phenotypes will have been accurately categorized, recall bias might have led to inaccurate categorization of some ALS-FTD cases (there were 29 ALS-FTD cases in our dataset, 4.4%). This might have led to a decrease in statistical power.

Another explanation for the lack of any genome-wide significant hits in our study might be that the phenotypic heterogeneity amongst C9orf72 repeat expansion carriers is more strongly modified by epigenetic or environmental factors. A previous study found that hypermethylation of the C9orf72 promoter region was associated with longer survival, but methylation did not differ between ALS and FTD repeat expansion carriers.43 Moreover, somatic heterogeneity of C9orf72 repeat expansions in different brain regions has been hypothesized as a possible determinant of developing either ALS or FTD, although this could not be confirmed in another study.44 As a logical step in the study of a repeat-related disorder, several studies have investigated whether C9orf72 repeat expansion size is associated with either the ALS or FTD phenotype. However, such an association has not been established so far, although C9orf72 repeat number, especially in the cerebellum, might determine survival duration from onset in either ALS or FTD.44-46

In conclusion, our study did not identify new genetic variants that determine an ALS or FTD phenotype in C9orf72 repeat expansion carriers. Although we collected the largest available sample of C9orf72 expansion carriers with genome-wide data, lack of statistical power might explain why we could not detect any genome-wide significant associations. We replicated the association with TMEM106B as a disease modifier using a candidate gene approach, further corroborating its role in the spectrum of ALS and FTD. Further international collaboration will be needed to improve sample size and statistical power in order to detect additional genetic modifiers that may explain phenotypic heterogeneity in C9orf72 repeat expansion carriers.
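The power considerations raised in the discussion can be illustrated with a short calculation. The sketch below is not the method used in the study (the text does not name the power calculation tool); it assumes a simple two-sided allelic test under a normal approximation, and the function name, argument names and group sizes are hypothetical placeholders rather than the actual cohort sizes.

```python
from statistics import NormalDist


def allelic_power(n_group1, n_group2, maf, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic association test.

    Assumption: normal approximation to the difference in allele
    frequency between two phenotype groups; `maf` is the allele
    frequency in group 2 (the reference group).
    """
    std_normal = NormalDist()
    p2 = maf
    # Allele frequency in group 1 implied by the allelic odds ratio.
    odds1 = odds_ratio * p2 / (1.0 - p2)
    p1 = odds1 / (1.0 + odds1)
    # Each individual contributes two alleles.
    se = (p1 * (1 - p1) / (2 * n_group1) + p2 * (1 - p2) / (2 * n_group2)) ** 0.5
    shift = abs(p1 - p2) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical group sizes, chosen only to show the drop in power
    # when the odds ratio decreases at a genome-wide threshold.
    for odds_ratio in (2.0, 1.5, 1.2):
        power = allelic_power(500, 200, maf=0.4, odds_ratio=odds_ratio)
        print(f"OR {odds_ratio}: power {power:.3f}")
```

With any fixed sample size, the function shows the same qualitative pattern described in the discussion: power at α = 5×10⁻⁸ falls steeply as the odds ratio moves towards 1.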

GENETIC MODIFIERS IN C9ORF72 REPEAT EXPANSION CARRIERS: A GENOME-WIDE ANALYSIS

REFERENCES

1. Testa D, Lovati R, Ferrarini M, Salmoiraghi F, Filippini G. Survival of 793 patients with amyotrophic lateral sclerosis diagnosed over a 28-year period. Amyotroph Lateral Scler Other Motor Neuron Disord 2004;5:208-212.
2. Lomen-Hoerth C, Murphy J, Langmore S, Kramer JH, Olney RK, et al. Are amyotrophic lateral sclerosis patients cognitively normal? Neurology 2003;60:1094-1097.
3. Murphy J, Henry R, Lomen-Hoerth C. Establishing subtypes of the continuum of frontal lobe impairment in amyotrophic lateral sclerosis. Arch Neurol 2007;64:330-334.
4. Ringholz GM, Appel SH, Bradshaw M, Cooke NA, Mosnik DM, et al. Prevalence and patterns of cognitive impairment in sporadic ALS. Neurology 2005;65:586-590.
5. Phukan J, Pender NP, Hardiman O. Cognitive impairment in amyotrophic lateral sclerosis. Lancet Neurol 2007;6:994-1003.
6. Ng ASL, Rademakers R, Miller BL. Frontotemporal dementia: a bridge between dementia and neuromuscular disease. Ann N Y Acad Sci 2014;1338:71-93.
7. Onyike CU, Diehl-Schmid J. The epidemiology of frontotemporal dementia. Int Rev Psychiatry 2013;25:130-137.
8. Brettschneider J, Del Tredici K, Toledo JB, Robinson JL, Irwin DJ, et al. Stages of pTDP-43 pathology in amyotrophic lateral sclerosis. Ann Neurol 2013;74:20-38.
9. Renton AE, Majounie E, Waite A, Simón-Sánchez J, Rollinson S, et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 2011;72:257-268.
10. Dejesus-Hernandez M, Mackenzie IR, Boeve BF, Boxer AL, Baker M, et al. Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 2011;72:245-256.
11. Johnson JO, Mandrioli J, Benatar M, Abramzon Y, Van Deerlin VM, et al. Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron 2010;68:857-864.
12. Mackenzie IR, Rademakers R, Neumann M. TDP-43 and FUS in amyotrophic lateral sclerosis and frontotemporal dementia. Lancet Neurol 2010;9:995-1007.
13. Le Ber I, Camuzat A, Berger E, Hannequin D, Laquerriere A, et al. Chromosome 9p-linked families with frontotemporal dementia associated with motor neuron disease. Neurology 2009;72:1669-1676.
14. Vance C, Al-Chalabi A, Ruddy D, Smith BN, Hu X, et al. Familial amyotrophic lateral sclerosis with frontotemporal dementia is linked to a locus on chromosome 9p13.2-21.3. Brain 2006;129:868-876.
15. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
16. van Es MA, Veldink JH, Saris CG, Blauw HM, van Vught PW, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
17. Laaksovirta H, Peuralinna T, Schymick JC, Scholz SW, Lai SL, et al. Chromosome 9p21 in amyotrophic lateral sclerosis in Finland: a genome-wide association study. Lancet Neurol 2010;9:978-985.
18. Majounie E, Renton AE, Mok K, Dopper EG, Waite A, et al. Frequency of the C9orf72 hexanucleotide repeat expansion in patients with amyotrophic lateral sclerosis and frontotemporal dementia: a cross-sectional study. Lancet Neurol 2012;11:323-330.
19. Schymick JC, Scholz SW, Fung HC, Britton A, Arepalli S, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 2007;6:322-328.
20. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol 2007;6:869-877.
21. Cronin S, Berger S, Ding J, Schymick JC, Washecka N, et al. A genome-wide association study of sporadic ALS in a homogenous Irish population. Hum Mol Genet 2008;17:768-774.
22. Chio A, Schymick JC, Restagno G, Scholz SW, Lombardo F, et al. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2009;18:1524-1532.
23. Traynor BJ, Nalls M, Lai SL, Gibbs RJ, Schymick JC, et al. Kinesin-associated protein 3 (KIFAP3) has no effect on survival in a population-based cohort of ALS patients. Proc Natl Acad Sci U S A 2010;107:12335-12338.
24. Van Deerlin VM, Sleiman PM, Martinez-Lage M, Chen-Plotkin A, Wang LS, et al. Common variants at 7p21 are associated with frontotemporal lobar degeneration with TDP-43 inclusions. Nat Genet 2010;42:234-239.
25. Ferrari R, Hernandez DG, Nalls MA, Rohrer JD, Ramasamy A, et al. Frontotemporal dementia and its subtypes: a genome-wide association study. Lancet Neurol 2014;13:686-699.
26. Johnson JO, Pioro EP, Boehringer A, Chia R, Feit H, et al. Mutations in the Matrin 3 gene cause familial amyotrophic lateral sclerosis. Nat Neurosci 2014;17:664-666.
27. Brooks BR, Miller RG, Swash M, Munsat TL, World Federation of Neurology Research Group on Motor Neuron Diseases. El Escorial revisited: revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotroph Lateral Scler Other Motor Neuron Disord 2000;1:293-299.
28. Neary D, Snowden JS, Gustafson L, Passant U, Stuss D, et al. Frontotemporal lobar degeneration: a consensus on clinical diagnostic criteria. Neurology 1998;51:1546-1554.
29. van der Zee J, Gijselinck I, Dillen L, Van Langenhove T, Theuns J, et al. A pan-European study of the C9orf72 repeat associated with FTLD: geographic prevalence, genomic instability, and intermediate repeats. Hum Mutat 2013;34:363-373.
30. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904-909.
31. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-575.
32. Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol 2010;34:816-834.
33. Howie B, Fuchsberger C, Stephens M, Marchini J, Abecasis GR. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet 2012;44:955-959.
34. Van Blitterswijk M, Mullen B, Nicholson AM, Bieniek KF, Heckman MG, et al. TMEM106B protects C9ORF72 expansion carriers against frontotemporal dementia. Acta Neuropathol 2014;127:397-406.
35. van Blitterswijk M, Mullen B, Heckman MG, Baker MC, DeJesus-Hernandez M, et al. Ataxin-2 as potential disease modifier in C9ORF72 expansion carriers. Neurobiol Aging 2014;35:2421 e2413-2427.
36. Cruchaga C, Graff C, Chiang HH, Wang J, Hinrichs AL, et al. Association of TMEM106B gene polymorphism with age at onset in granulin mutation carriers and plasma granulin protein levels. Arch Neurol 2011;68:581-586.
37. van der Zee J, Van Langenhove T, Kleinberger G, Sleegers K, Engelborghs S, et al. TMEM106B is associated with frontotemporal lobar degeneration in a clinically diagnosed patient cohort. Brain 2011;134:808-815.
38. Vass R, Ashbridge E, Geser F, Hu WT, Grossman M, et al. Risk genotypes at TMEM106B are associated with cognitive impairment in amyotrophic lateral sclerosis. Acta Neuropathol 2011;121:373-380.
39. Gallagher MD, Suh E, Grossman M, Elman L, McCluskey L, et al. TMEM106B is a genetic modifier of frontotemporal lobar degeneration with C9orf72 hexanucleotide repeat expansions. Acta Neuropathol 2014;127:407-418.
40. Brady OA, Zheng Y, Murphy K, Huang M, Hu F. The frontotemporal lobar degeneration risk factor, TMEM106B, regulates lysosomal morphology and function. Hum Mol Genet 2013;22:685-695.
41. Debaisieux S, Schiavo G. TiME for TMEM106B. EMBO J 2014;33:405-406.
42. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 2011;88:76-82.
43. Russ J, Liu EY, Wu K, Neal D, Suh E, et al. Hypermethylation of repeat expanded C9orf72 is a clinical and molecular disease modifier. Acta Neuropathol 2015;129:39-52.
44. van Blitterswijk M, DeJesus-Hernandez M, Niemantsverdriet E, Murray ME, Heckman MG, et al. Association between repeat sizes and clinical and pathological characteristics in carriers of C9ORF72 repeat expansions (Xpansize-72): a cross-sectional cohort study. Lancet Neurol 2013;12:978-988.
45. Benussi L, Rossi G, Glionna M, Tonoli E, Piccoli E, et al. C9ORF72 hexanucleotide repeat number in frontotemporal lobar degeneration: a genotype-phenotype correlation study. J Alzheimers Dis 2014;38:799-808.
46. Dols-Icardo O, Garcia-Redondo A, Rojas-Garcia R, Sanchez-Valle R, Noguera A, et al. Characterization of the repeat expansion size in C9orf72 in amyotrophic lateral sclerosis and frontotemporal dementia. Hum Mol Genet 2014;23:749-754.

SUPPLEMENTARY INFORMATION

Supplementary Table 1 Cohorts, strata and quality control metrics

[Table body not recoverable from the extracted text. Legible column labels include: platform, Pop., SNPs, rate, 1KG, miss., autosom., zygosity, outlier. Legible footnote abbreviations include: MAF = minor allele frequency; HWE = Hardy-Weinberg equilibrium; QC = quality control.]


Supplementary Figure 1 Detection of population outliers

Plot for the detection of population outliers based on a principal components analysis in EIGENSTRAT. The first two principal components are plotted. HapMap Phase 3 release 3 reference samples (CEU, TSI, ASW, MKK, YRI, LWK, GIH, MEX, JPT, CHB, CHD populations) are shown as colored diamonds (green, red, yellow and blue). Study samples are designated by colored circles. Samples marked with ‘×’ were identified as population outliers and were marked for removal.
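The outlier detection shown in this figure can be sketched in code. The caption does not state the exclusion criterion, so the rule below is an assumption: a study sample is flagged when it lies more than a chosen number of standard deviations from the mean of a reference cluster on any plotted principal component. The function name, the threshold and the toy coordinates are hypothetical.

```python
from statistics import mean, stdev


def flag_population_outliers(reference_pcs, study_pcs, n_sd=6.0):
    """Return indices of study samples far from a reference cluster.

    reference_pcs / study_pcs: lists of tuples of principal component
    scores (e.g. PC1 and PC2). The n_sd threshold is an assumed value,
    not one taken from the study.
    """
    n_components = len(reference_pcs[0])
    centers = [mean(s[k] for s in reference_pcs) for k in range(n_components)]
    spreads = [stdev(s[k] for s in reference_pcs) for k in range(n_components)]
    outliers = []
    for index, sample in enumerate(study_pcs):
        too_far = any(
            abs(sample[k] - centers[k]) > n_sd * spreads[k]
            for k in range(n_components)
        )
        if too_far:
            outliers.append(index)
    return outliers


if __name__ == "__main__":
    # Toy coordinates: a tight reference cluster and two study samples.
    reference = [(0.0, 0.0), (0.01, -0.01), (-0.01, 0.01), (0.02, 0.0), (-0.02, 0.0)]
    study = [(0.0, 0.005), (0.5, 0.4)]
    print(flag_population_outliers(reference, study))
```

In practice the reference cluster would be the HapMap population matching the study's ancestry, and the threshold would be chosen after inspecting the plot.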

Supplementary Figure 2 Quantile-quantile plot


Quantile-quantile plot for association results after imputation. The genomic inflation factor (λGC) is shown at the bottom right.
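As an illustration of how the quantities in such a plot are usually obtained, the sketch below computes the genomic inflation factor as the median observed 1-df chi-square statistic divided by the median expected under the null, together with the expected and observed −log10(p) pairs for the plot. This is the standard definition rather than code from the study; the function names are hypothetical.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.25) ** 2


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided association p-values (0 < p <= 1)."""
    # Convert each p-value back to a 1-df chi-square statistic via the
    # lower normal tail, which stays accurate for very small p-values.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a quantile-quantile plot."""
    observed = sorted(p_values)
    n = len(observed)
    return [
        (-math.log10((rank + 0.5) / n), -math.log10(p))
        for rank, p in enumerate(observed)
    ]
```

A value close to 1 indicates that the bulk of the test statistics follows the null distribution; values well above 1 suggest residual stratification or other systematic inflation.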

APPENDIX

THE INTERNATIONAL COLLABORATION FOR FRONTOTEMPORAL DEMENTIA

Vivianna M Van Deerlin, Patrick M A Sleiman, Maria Martinez-Lage, Alice Chen-Plotkin, Li-San Wang, Neill R Graff-Radford, Dennis W Dickson, Rosa Rademakers, Bradley F Boeve, Murray Grossman, Steven E Arnold, David M A Mann, Stuart M Pickering-Brown, Harro Seelaar, Peter Heutink, John C van Swieten, Jill R Murrell, Bernardino Ghetti, Salvatore Spina, Jordan Grafman, John Hodges, Maria Grazia Spillantini, Sid Gilman, Andrew P Lieberman, Jeffrey A Kaye, Randall L Woltjer, Eileen H Bigio, Marsel Mesulam, Safa al-Sarraj, Claire Troakes, Roger N Rosenberg, Charles L White III, Isidro Ferrer, Albert Lladó, Manuela Neumann, Hans A Kretzschmar, Christine Marie Hulette, Kathleen A Welsh-Bohmer, Bruce L Miller, Ainhoa Alzualde, Adolfo Lopez de Munain, Ann C McKee, Marla Gearing, Allan I Levey, James J Lah, John Hardy, Jonathan D Rohrer, Tammaryn Lashley, Ian R A Mackenzie, Howard H Feldman, Ronald L Hamilton, Steven T Dekosky, Julie van der Zee, Samir Kumar-Singh, Christine Van Broeckhoven, Richard Mayeux, Jean Paul G Vonsattel, Juan C Troncoso, Jillian J Kril, John B J Kwok, Glenda M Halliday, Thomas D Bird, Paul G Ince, Pamela J Shaw, Nigel J Cairns, John C Morris, Catriona Ann McLean, Charles DeCarli, William G Ellis, Stefanie H Freeman, Matthew P Frosch, John H Growdon, Daniel P Perl, Mary Sano, David A Bennett, Julie A Schneider, Thomas G Beach, Eric M Reiman, Bryan K Woodruff, Jeffrey Cummings, Harry V Vinters, Carol A Miller, Helena C Chui, Irina Alafuzoff, Päivi Hartikainen, Danielle Seilhean, Douglas Galasko, Eliezer Masliah, Carl W Cotman, M Teresa Tuñón, M Cristina Caballero Martínez, David G Munoz, Steven L Carroll, Daniel Marson, Peter F Riederer, Nenad Bogdanovic, Gerard D Schellenberg, Hakon Hakonarson, John Q Trojanowski, Virginia M-Y Lee.

THE SLAGEN CONSORTIUM Isabella Fogh, Antonia Ratti, Cinzia Gellera, Kuang Lin, Cinzia Tiloca, Valentina Moskvina, Lucia Corrado, Gianni Sorarù, Cristina Cereda, Stefania Corti, Davide Gentilini, Daniela Calini, Barbara Castellotti, Letizia Mazzini, Giorgia Querin, Stella Gagliardi, Roberto Del Bo, Francesca Luisa Conforti, Cosenza, Gabriele Siciliano, Maurizio Inghilleri, Francesco Saccà, Paolo Bongioanni, Silvana Penco, Massimo Corbo, Sandro Sorbi, Massimiliano Filosto, Alessandra Ferlini, Anna Maria Di Blasio, Stefano Signorini, Nicola Ticozzi, Mauro Ceroni, Elena Pegoraro, Giacomo P Comi, Sandra D’Alfonso, Franco Taroni, Ammar Al-Chalabi, John Powell and Vincenzo Silani.


9

SUMMARY AND GENERAL DISCUSSION

The aim of this thesis was to identify genetic susceptibility factors in ALS. To achieve this aim we investigated candidate genes, gene-environment interactions, expression quantitative trait loci (eQTLs), genetic pleiotropy, and disease modifiers in ALS. We have shown an example of gene-environment interaction for the paraoxonase 1 gene and population density. We have identified an ANG mutation in a pedigree of familial ALS, in which one patient additionally showed signs of frontotemporal dementia (FTD) and parkinsonism. By the genetic mapping of gene expression we have implicated CYP27A1 as a susceptibility gene in ALS. We have found that UNC13A modifies survival in ALS patients and, additionally, is a shared risk locus for both ALS and FTD. We did not find any shared susceptibility loci for ALS and multiple sclerosis (MS). Also, we were not able to identify genetic modifiers that determined phenotypic heterogeneity in C9orf72 repeat expansion carriers.

This thesis largely follows the developments of genetic research in ALS. Early genetic studies in ALS investigated pedigrees with familial ALS using linkage studies, while sporadic ALS patients were studied using a candidate gene approach based upon hypothesized pathogenic pathways. In 2007, the era of genome-wide association studies (GWAS) began in the search for genetic susceptibility factors in sporadic ALS. While these GWASs have implicated several new disease loci, most of them have proven difficult to replicate. More importantly, the results have been insufficient to explain the estimated heritability of approximately 60% for sporadic ALS.1 In this thesis, we have explored different methods to search for this ‘missing heritability’.
By partly building on previous GWAS data and adding, for example, gene expression profiles or GWAS data from neighboring diseases, the aim was to maximize the informative potential of these tremendous amounts of genetic data. Lastly, we looked into genetic disease modifiers, instead of studying susceptibility risk factors per se, as these may provide additional clues to pathogenic mechanisms in ALS or even targets for therapy. The results from the studies in this thesis are summarized and discussed in more detail below.

CANDIDATE GENE APPROACHES

In Chapter 2 we tested whether genetic variants in paraoxonase (PON) genes would interact with environmental exposure to paraoxonase substrates (e.g. pesticides), which has been implicated as an environmental risk factor for ALS. We included 98 patients from a British population-based registry and obtained genotypes for four single nucleotide polymorphisms (SNPs) in paraoxonase genes that were previously associated with ALS. Subsequently, we tested for the interaction between SNP genotype and population density (as a proxy for rural versus urban regions, presuming differential exposures to PON substrates) in a case-only design. We found a significant gene-environment interaction for rs854560 (amino acid change L55M) in PON1. The minor allele (M) was more frequent in ALS patients with rural residence, suggesting that in the rural environment the MM genotype predisposes to ALS. Also, the minor allele was associated with shorter survival. Although the study results suggest a compelling conclusion, several weaknesses should be pointed out. The study size is rather small, possibly inducing a type I error. Furthermore, population density may reflect many differences in environmental exposures besides the intended exposure to pesticides. Lastly, a case-only design only measures the gene-environment interaction, while the separate effects of gene or environment cannot be determined. Most notably, more recent publications on paraoxonase polymorphisms, including a large meta-analysis, failed to identify significant associations with sporadic ALS.2,3 The initially reported associations between paraoxonase and ALS might, therefore, have to be considered false positive findings.4-6 In view of this, the identification of a gene-environment interaction for the L55M polymorphism in PON1 in ALS patients may be of limited value. Of interest, there appears to be more evidence for an association between paraoxonase polymorphisms and Parkinson’s disease, although once again conflicting results have been reported.7

Chapter 3 describes a large pedigree of ALS patients and a patient with a complex phenotype including motor neuron signs, parkinsonism, and FTD symptoms. We screened a set of 39 unrelated familial ALS index patients for ANG mutations and identified a 122A>T mutation in one patient, leading to a K17I amino acid change in the ANG protein. Further investigation of 44 out of 62 family members of the proband’s pedigree (five affected individuals) revealed that all affected family members carried the K17I mutation. There were ten carriers without symptoms, nine of whom were under the age of 50 years. We also screened a set of 275 unrelated control samples, in which no K17I mutation was detected. We demonstrated segregation of the ANG K17I mutation with disease in a large pedigree.
One family member with the K17I mutation showed a striking phenotype, with atypical parkinsonism at onset progressing to a combined ALS and FTD phenotype in the further course of disease. Interestingly, in a later mutation screening for other familial ALS genes, four out of five affected family members also appeared to carry an N352S mutation in TARDBP.8 Even more interesting was the finding that the family member with a combined ALS, parkinsonism, and FTD phenotype carried the ANG K17I mutation without the TARDBP mutation. The TARDBP N352S mutation is considered to be pathogenic, although there may be incomplete penetrance.8 Later studies have also demonstrated ANG K17I mutations in control subjects. The K17I mutation, however, may still be relevant to ALS pathogenesis as it is likely to disrupt normal protein function, including angiogenic and ribonucleolytic activity.9 Furthermore, a large study has demonstrated that ANG mutations are a risk factor not only for ALS, but also for Parkinson’s disease.10 The latter finding might explain the combined phenotype in the family member carrying the ANG mutation only. The co-segregation of ANG and TARDBP mutations in this family supports the possibility of an oligogenic basis for disease in ALS, as has been described previously.8

GENETIC MAPPING OF GENE EXPRESSION

Gene expression levels can be mapped to genetic variation to identify so-called expression quantitative trait loci or eQTLs. eQTLs can have local effects (cis) or distant effects (trans) across the genome. Trait-associated SNPs are more likely to be eQTLs.11 We therefore conducted a genome-wide mapping of gene expression in order to detect novel disease-causing variants that may not have been detected in traditional GWAS designs due to strict multiple-testing correction (Chapter 4). Using a two-stage design we determined eQTLs in a total of 323 sporadic ALS patients and 413 controls and we prioritized these eQTLs based on results from a two-stage GWAS, including a total of 3,568 ALS patients and 10,163 unaffected controls. Expression profiles were derived from peripheral whole blood. Ultimately, we identified one cis eQTL (eight SNP-mRNA transcript pairs) to be associated with ALS (replication p = 1.19×10⁻⁴⁷), which explained 48-65% of the variance in gene expression. The SNP minor alleles were associated with an increased expression of CYP27A1, a gene involved in cholesterol metabolism in which mutations are known to cause a rare neurological disorder called cerebrotendinous xanthomatosis (CTX).12,13 CTX can present with upper motor neuron symptoms and is a known mimic of primary lateral sclerosis. In addition, cholesterol metabolism has previously been implicated in ALS pathogenesis.14-16 These findings would make CYP27A1 an interesting candidate gene for ALS. Additionally, we have been able to fine-map the chromosome 9p21.2 locus using eQTLs, confirming C9orf72 as the functionally relevant gene within this locus. In our data we confirmed that trait-associated SNPs were enriched for eQTLs.

The study was designed to minimize the chance of false-positive associations by meticulous quality control, excluding expression probes with non-specific mapping, permutation of SNP-transcript pair associations, and by using a two-stage approach. A drawback of the study is the use of whole blood instead of neuronal tissue. The overlap of eQTLs between different tissue types appears to be modest.
One study has shown a 21.8% overlap of cis eQTLs between B-cells and monocytes in blood.17 Similarly, a 22% overlap was found between two studies on brain tissue.18,19 Another study found that 28.7% of cis eQTLs were different across adipose tissue, muscle, liver and whole blood.20 Furthermore, trait-associated SNPs appear to be enriched for tissue-specific eQTLs.20 About 37-52% of genes mapped by cis eQTLs in human brain tissue studies were present in our data. While we considered peripheral whole blood a valid starting point for the detection of eQTLs associated with ALS, we must consider the possibility that these might not translate to neuronal tissues. Conversely, we might not have picked up brain-specific eQTLs in peripheral blood.

The two-stage design offered a valuable internal replication step, but may have attenuated statistical power for the GWAS results that were included in the analysis. For example, the GWAS SNP p-value for the CYP27A1 eQTL was only just nominally significant (p = 0.049) in the discovery GWAS (2,261 ALS cases and 8,328 controls), although the association reached p = 1.32×10⁻⁴ in the replication GWAS (1,307 ALS cases and 1,835 controls). To date, no published study has replicated the association of CYP27A1 with ALS, including the most recent and largest GWAS.21 Therefore, the role of CYP27A1 in the pathogenesis of ALS remains uncertain. Nevertheless, the study has shown that such an integrated approach is a valuable method for prioritizing genetic variants from GWAS loci, given the results in the C9orf72 locus. Furthermore, the technique more directly assesses functional pathways involved in disease pathogenesis.22
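To make the notion of "variance in gene expression explained" by an eQTL concrete, the sketch below regresses expression on genotype dosage and reports the slope and R². It assumes an additive model with dosages coded 0, 1 or 2 copies of the minor allele and an ordinary least-squares fit; the model actually used in Chapter 4 is not described in this section, and the function name and toy values are hypothetical.

```python
def eqtl_variance_explained(dosages, expression):
    """Slope and R-squared of expression regressed on genotype dosage.

    dosages: minor-allele counts per individual (0, 1 or 2), assumed
    additive coding. expression: matching expression levels.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    slope = sxy / sxx
    # For simple linear regression, R-squared equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


if __name__ == "__main__":
    # Toy data for illustration only, not values from the study.
    toy_dosages = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
    toy_expression = [7.9, 8.2, 8.8, 9.1, 9.9, 10.3, 8.0, 9.0, 10.1, 8.7]
    slope, r_squared = eqtl_variance_explained(toy_dosages, toy_expression)
    print(f"slope per minor allele: {slope:.2f}, R-squared: {r_squared:.2f}")
```

A positive slope corresponds to higher expression with each additional copy of the minor allele, which is the direction reported for the CYP27A1 eQTL.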


GENETIC PLEIOTROPY

Genetic pleiotropy occurs when one gene affects multiple traits; many examples exist in human genetic disorders. In the search for the missing heritability in sporadic ALS we combined GWAS data from ALS patients with data from multiple sclerosis (MS) patients (Chapter 5) and with a GWAS of patients with frontotemporal dementia (Chapter 6). Because genetic risk factors may be shared across neighboring disorders, such undertakings may yield new insights into pathogenic mechanisms that are involved in both diseases.

ALS AND MS

In Chapter 5, we collected GWAS data from 3,762 sporadic ALS patients and 4,886 controls and combined these data with 4,088 MS patients and 7,144 controls. After genome-wide SNP imputation to increase comparability and coverage across study strata, we performed a genome-wide meta-analysis of both diseases. We could not replicate genetic variants previously associated with one disease in the other. We further looked for association signals that would strengthen each other in the combined analysis. Only the HLA region on chromosome 6 reached genome-wide significance, and a suggestive association signal was found on chromosome 17 near NPEPPS (rs2935183, p < 5×10⁻⁷). However, both signals were driven by the MS data. Lastly, we looked for a shared polygenic contribution of multiple SNP markers collectively for both diseases, but failed to identify a subset of SNPs that explained genetic variance in both ALS and MS. In conclusion, we found no evidence for a shared genetic basis of common variants in ALS and MS. The study can be considered well powered for the detection of shared common risk variants in both diseases, although the possibility of a shared rare risk variant cannot be excluded. Previous studies have reported a concurrence of ALS and MS, in particular in the presence of C9orf72 repeat expansions.23,24 However, this concurrence might very well be explained by referral bias or, in the case of the Iranian study,24 by population-specific effects. A survey of patients with both ALS and MS symptoms in a population-based study in The Netherlands did not show a higher than expected frequency of concurrence of ALS and MS.25 In patients with MS symptoms only, no C9orf72 repeat expansions have been identified.23,26 Based on the aforementioned results, ALS and MS show little evidence for a shared genetic basis.
There is, on the other hand, more convincing evidence for shared genetics between MS and inflammatory diseases.27 Therefore, although neurodegenerative processes may be involved in multiple sclerosis (either causative or consequential), they do not appear to be regulated by genes that are involved in the pathogenesis of ALS.

ALS AND FTD

Subsequently, we looked for shared genetic risk loci between ALS and FTD (Chapter 6). We collected all available previous ALS GWAS data and combined these with a GWAS of pathology-proven FTD with TDP-43 inclusions (FTD-TDP). A total of 4,377 ALS patients and 13,017 controls were combined with 435 FTD-TDP patients and 1,414 controls. We applied uniform quality control to the raw genotype data and performed genome-wide SNP imputation using the HapMap3 reference panel to increase comparability and genome coverage. By using three complementary methods (a joint meta-analysis, replication of top results from one disease in the other, and a rank-products analysis allocating equal weight to ALS and FTD sample sizes) we demonstrated that not only C9orf72, but also UNC13A is a shared risk locus for ALS and FTD. Despite the relatively small size of the FTD-TDP cohort, we found a strong signal at rs12608932 in UNC13A (OR 1.46, p = 6.57×10⁻⁶) in the FTD-TDP cases. Similar to the C9orf72 locus, the association signal at UNC13A was greatly enhanced in a combined meta-analysis of ALS and FTD cohorts. Of course, one might argue that these association signals were mainly driven by the large ALS cohort in comparison to the relatively small FTD-TDP cohort. We therefore included two analyses that were independent of relative sample size, and we demonstrated that there was a major contribution to the UNC13A signal from the FTD-TDP cohort. Additionally, we showed that association signals were not solely driven by FTD patients with motor neuron symptoms. Lastly, we looked into a suggestive association signal at chromosome 8q24.13 that emerged from the joint meta-analysis (lowest p-value = 3.91×10⁻⁷, OR 0.79). We replicated associations with two SNPs in this locus in 4,056 ALS patients and 3,958 controls with nominal significance. The locus includes the gene KIAA0196 (alias SPG8), which codes for strumpellin and in which mutations are known to cause hereditary spastic paraplegia.28 However, the combined discovery and replication p-values for this locus did not reach the widely accepted threshold of genome-wide significance (p < 5×10⁻⁸). Therefore, additional independent replication, preferably in both ALS and FTD cohorts, will be required to definitively establish the association of the 8q24.13 locus with ALS and FTD.
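The rank-products analysis mentioned above gives ALS and FTD equal weight because it combines per-disease ranks rather than effect estimates weighted by sample size. The following is a minimal sketch of that idea only, not the procedure used in Chapter 6: the function names and the toy p-values are hypothetical, and a full rank-products analysis would also assess significance, for example by permutation, which is omitted here.

```python
from math import prod


def rank_by_pvalue(pvalues):
    """Map each SNP id to its 1-based rank (1 = smallest p-value)."""
    ordered = sorted(pvalues, key=pvalues.get)
    return {snp: rank for rank, snp in enumerate(ordered, start=1)}


def rank_product(*studies):
    """Geometric mean of per-study ranks for SNPs present in every study."""
    shared = set(studies[0]).intersection(*studies[1:])
    ranks = [rank_by_pvalue({snp: study[snp] for snp in shared}) for study in studies]
    k = len(studies)
    return {snp: prod(r[snp] for r in ranks) ** (1.0 / k) for snp in shared}


# Hypothetical toy p-values, NOT results from the thesis.
als = {"snpA": 1e-9, "snpB": 1e-4, "snpC": 0.20, "snpD": 0.60}
ftd = {"snpA": 1e-3, "snpB": 0.50, "snpC": 1e-2, "snpD": 0.90}

scores = rank_product(als, ftd)
top = min(scores, key=scores.get)  # SNP ranked highest in both studies combined
print(top, scores[top])
```

Because only ranks enter the score, a SNP must rank highly in both diseases to obtain a low rank product, regardless of how many samples each study contributed.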
The results from Chapter 6 further support the hypothesis that ALS and FTD form parts of a spectrum of neurodegenerative disease.29 We have confirmed the important role of C9orf72 in the ALS-FTD spectrum. More importantly, our study has been the first to implicate UNC13A in the pathogenesis of FTD-TDP, further corroborating the role of this gene in neuronal degeneration. The role of UNC13A in ALS and FTD is discussed in more detail further in this chapter.

DISEASE MODIFIERS

Another way to gain insight into disease pathways in ALS is to investigate modifiers of disease susceptibility or disease progression. Following the first GWAS in ALS that identified UNC13A as a risk factor for sporadic ALS, we examined whether common variation in this gene might modify age at onset or survival in ALS (Chapter 7). We genotyped rs12608932 in UNC13A in a new population-based sample of 450 sporadic ALS patients and 524 unaffected control individuals and included individuals with age at onset and survival data from previous GWAS cohorts (1,767 ALS patients and 1,817 controls).30 In the newly genotyped cohort we found a significant association of rs12608932 with ALS susceptibility (OR 1.91, p = 0.001), and we confirmed the association in the GWAS cohort. Additionally, in both the population-based cohort and the GWAS cohort we found a significant effect of UNC13A on survival in ALS patients, with the minor allele being associated with shorter survival. The difference in median survival between genotypic groups was 10.0 months in the population-based cohort and 5.0 months in the GWAS cohort. We found no association with age at onset. This study highlights the need for population-based cohorts when investigating genetic susceptibility factors that modify survival, as referral-based prevalent cohorts may miss a proportion of short survivors, thus influencing allele frequencies in the sampled cohort (“frailty bias”).31-35 The use of referral-based, prevalent cohorts instead of population-based cohorts may, besides the lack of statistical power, explain why early replication attempts failed to confirm an association between UNC13A and sporadic ALS.36,37 An Italian population-based study has confirmed the association of UNC13A with survival in sporadic ALS. That study found that homozygosity for the rare allele shortened ALS survival by approximately 12 months.38

Chapter 8 describes the results of a search for genetic modifiers that determine the onset of either ALS or FTD in C9orf72 repeat expansion carriers. Genome-wide SNP genotypes of ALS and FTD patients with a C9orf72 repeat expansion were collected from previous studies across multiple countries, and we added a newly genotyped cohort of Dutch C9orf72 expansion carriers. Individuals were assigned a diagnosis group of ALS or FTD based on their predominant clinical symptom at disease onset. After quality control, 402 ALS patients and 253 FTD patients remained, for whom we performed genome-wide SNP imputation using the 1000 Genomes reference panel. We then tested for association between SNP genotypes and the onset of either ALS or FTD. We found no genome-wide significant associations.
Subsequently, we focused on common variants in TMEM106B and ATXN2 that have previously been implicated as disease modifiers in C9orf72 repeat expansion carriers.39,40 We were able to replicate the association between common variants in TMEM106B and the ALS or FTD phenotype in C9orf72 carriers with nominal significance (OR 0.79, p = 0.015) and, similarly, we found a nominally significant association for a SNP in ATXN2. Although the study includes the largest sample of ALS and FTD C9orf72 repeat expansion carriers with genome-wide data available, and all individuals shared the same genetic aberration, the lack of statistical power may still explain why we have not identified any novel ‘genetic switches’ that determine the onset of either ALS or FTD. Common variants in TMEM106B have been identified as a risk factor for FTD-TDP and cognitive symptoms in ALS.41 Our findings add to the existing evidence that TMEM106B is a disease modifier in C9orf72 repeat expansion carriers.40,42 The TMEM106B protein plays an important role in lysosomal morphology and trafficking in neurons, which highlights this pathway in neurodegeneration.43,44
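The odds ratios quoted in this section (for example OR 0.79 for TMEM106B) summarize how allele counts differ between two groups. As an illustration only, the sketch below computes an allelic odds ratio with a Woolf (log-scale) 95% confidence interval from a 2×2 table of allele counts. It assumes a plain allelic comparison without covariates, which may differ from the tests actually used in Chapters 7 and 8; the function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def allelic_odds_ratio(g1_minor, g1_major, g2_minor, g2_major, z=1.96):
    """Odds ratio for the minor allele (group 1 vs group 2) with a Woolf interval.

    Arguments are allele counts, i.e. two alleles per genotyped individual.
    """
    odds_ratio = (g1_minor * g2_major) / (g1_major * g2_minor)
    # Standard error of log(OR) from the four cell counts (Woolf's method).
    se_log_or = sqrt(1 / g1_minor + 1 / g1_major + 1 / g2_minor + 1 / g2_major)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, NOT data from the thesis:
# group 1 (e.g. ALS-onset carriers) vs group 2 (e.g. FTD-onset carriers).
estimate, ci_low, ci_high = allelic_odds_ratio(300, 500, 200, 300)
print(f"OR = {estimate:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An odds ratio below 1 means the minor allele is less frequent in group 1 than in group 2; an interval that includes 1 indicates that the data are compatible with no difference.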

UNC13A

One of the main findings in this thesis is the accumulating evidence for UNC13A in the pathogenesis of ALS. The common variant rs12608932 in UNC13A on chromosome 19p13.11 was first identified in a large two-stage GWAS of sporadic ALS.30 The association has proven difficult to replicate in mostly small ALS cohorts, probably due to the small effect size and the use of prevalent, often referral-based cohorts (as has been explained in detail in Chapter 7).21,36,37 Another explanation may be population-specific effects.45 However, we have been able to replicate the association with disease susceptibility in a Dutch population-based cohort of ALS and, most notably, in an international dataset of pathology-proven FTD-TDP cases (Chapters 6 and 7). A forest plot of published association results for rs12608932 is shown in Figure 1, indicating an overall significant effect. Furthermore, we identified UNC13A as a modifier of survival in ALS (Chapter 7). This finding has now been replicated in two independent ALS cohorts, one of which consisted of C9orf72 repeat expansion carriers.38,46 The minor allele of rs12608932 was associated with up to 12 months shorter survival for the rare homozygote, which clearly is a clinically relevant effect. Therefore, UNC13A is a very interesting therapeutic target, because intervention in the pathogenic mechanism of UNC13A may prolong survival of ALS patients. The rs12608932 SNP is located in an intron of the UNC13A gene and most likely tags rare causal variation in or near the UNC13A gene. However, Sanger sequencing of the exons of UNC13A has not revealed a likely causal variant.47

Figure 1 Forest plot of published association results for rs12608932 in UNC13A in populations of European ancestry

Association results are given for allelic tests. OR = odds ratio. Meta-analysis results for a fixed effect model are shown. Box sizes are relative to stratum sample sizes. Horizontal lines indicate 95% confidence intervals.
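For readers unfamiliar with the fixed effect model named in the legend, the sketch below shows inverse-variance pooling of odds ratios on the log scale. It is a generic illustration under stated assumptions, not the code used to produce Figure 1: standard errors are recovered from 95% confidence intervals, and the function name and stratum values are hypothetical rather than taken from the figure.

```python
from math import exp, log, sqrt


def fixed_effect_meta(strata, z=1.96):
    """Inverse-variance fixed-effect pooling of odds ratios.

    strata: iterable of (odds_ratio, ci_lower, ci_upper) with 95% intervals.
    Returns the pooled odds ratio and its 95% confidence interval.
    """
    weights, weighted_logs = [], []
    for odds_ratio, lower, upper in strata:
        # The CI spans 2*z standard errors on the log scale.
        se = (log(upper) - log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weights.append(weight)
        weighted_logs.append(weight * log(odds_ratio))
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return (exp(pooled_log),
            exp(pooled_log - z * pooled_se),
            exp(pooled_log + z * pooled_se))


# Hypothetical strata, NOT the values plotted in Figure 1.
strata = [(1.20, 1.05, 1.37), (1.10, 0.90, 1.34), (1.30, 1.00, 1.69)]
pooled, pooled_low, pooled_high = fixed_effect_meta(strata)
print(f"pooled OR = {pooled:.2f} (95% CI {pooled_low:.2f}-{pooled_high:.2f})")
```

Strata with narrower confidence intervals receive larger weights, which is why larger cohorts dominate the pooled estimate in a fixed effect model.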

The UNC13A protein is involved in priming presynaptic vesicles before their release into the synapse. Disruption of normal UNC13A function may affect the release of neurotransmitters and, subsequently, related neuronal network activity.48 Also, aberrant release of (excitatory) neurotransmitters due to abnormal UNC13A function fits the previously described glutamate excitotoxicity hypothesis in ALS.49 Notably, the only drug with proven effect on disease progression, riluzole, is a glutamate inhibitor.50 Thus, the association with UNC13A implicates a role for synaptic function and neurotransmitter release as a converging mechanism in the pathogenesis of ALS and FTD.

FUTURE PERSPECTIVES AND CONCLUSIONS

Following the great success stories of genome-wide association studies in common diseases such as type 2 diabetes and Crohn’s disease, the results from the first rush of GWASs in ALS have been slightly underwhelming. Most of the identified risk variants could not be confirmed in independent replication efforts. Possible explanations for the lack of consistent findings include the apparent heterogeneity of the ALS phenotype (e.g. variable age at onset, survival duration, site of onset), population-specific effects arising from the use of multiple international cohorts, and possible type I errors due to small study sizes. To date, the most significant result has been the association with C9orf72 in the chromosome 9p21.2 locus (which was known from linkage studies in ALS-FTD pedigrees). Despite increasing sample sizes of the ALS GWASs, the need for new analysis methods has arisen in order to account for the still largely missing heritability in ALS. In this thesis, we have explored several of those methods in the search for additional genetic susceptibility factors for ALS. By the genetic mapping of gene expression we have identified CYP27A1 as a risk locus for ALS, although there have been no follow-up studies to confirm this finding. The use of larger sample sizes for both GWAS and eQTL data sets may improve statistical power and robustness of results in future studies. A larger study size will, in addition, allow for the detection of cis and trans eQTLs with small effect sizes. Furthermore, with the emergence of next-generation sequencing techniques, expression profiling may be done using direct sequencing of mRNA transcripts (RNA-seq), instead of using the current micro-array platforms.
RNA-seq may be able to identify transcripts with expression levels below the detection level of micro-arrays, and does not depend on a predefined set of probes, thus enabling the detection of unannotated transcripts.22 Obviously, gene expression profiling in neuronal tissue will be more informative for ALS than the use of peripheral blood. However, sampling of brain tissue is not feasible during the lifetime of ALS patients, and post-mortem material most likely shows end-stage disease expression profiles, which may not be representative of ALS pathogenesis. A better source tissue for the genetic mapping of expression may be motor neurons or other neuronal cell lineages derived from induced pluripotent stem cells from ALS patients. Furthermore, genomic data may be integrated with data from other levels, such as protein function, methylation, or metabolite levels.

The accumulating evidence for UNC13A warrants further fine-mapping of the GWAS association signal. Exonic mutation screening has, however, not resulted in plausible causal variants.47 Therefore, the causal variant might lie in intronic sequence or, in view of the discovery of C9orf72, might represent another repeat expansion. There are up to 100 simple repeats in the UNC13A gene that usually are not well captured by next-generation sequencing. Thus, a systematic, manual screening of each of these repeats would be required. Alternatively, the development of single-strand DNA sequencing techniques promises more accurate capturing of structural variation in genomic DNA.

Recently, a GWAS with the largest number of sporadic ALS patients to date has been published, including 6,100 ALS cases and 7,125 controls, identifying a novel risk locus on chromosome 17q11.2.21 Also, GWASs of other traits, such as body mass index (BMI; 339,224 individuals) or adult human height (253,288 individuals), have demonstrated that an ever increasing sample size may continue to yield additional associations.51,52 Larger sample sizes may, especially in view of a heterogeneous disease like sporadic ALS, continue to improve statistical power to detect associations. Furthermore, the field of complex genetic traits is shifting towards the study of rare variants, which may have larger effect sizes and further account for the missing heritability in sporadic ALS. Next-generation sequencing techniques make it possible to assess the presence of rare variants (MAF < 1%) across the genome. However, these techniques may be less suitable for the detection of structural variation, such as copy number variants or large repeat expansions. Whole-exome sequencing has already proven fruitful in familial ALS. By analyzing multiple pedigrees, several novel familial ALS genes have been identified, including VCP, HNRNPA1 and HNRNPA2B1, PFN1, MATR3, TUBA4A, CHCHD10, and most recently, TBK1.53,54 In the study of sporadic ALS, this would call for whole-exome or whole-genome association studies instead of the current genome-wide association studies, which still does not seem feasible because of the vast sample size required (think of tens of thousands of individuals), the high costs involved, and the massive computing power that would be needed.
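The sample-size argument above can be made concrete with a rough power calculation. The sketch below approximates the power of a two-sided allelic test at the genome-wide threshold using a normal approximation for the difference in allele frequency between cases and controls. It is an illustration under simplifying assumptions (independent alleles, no covariates or population structure), not a calculation reported in this thesis; all function names and parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a control frequency and an allelic OR."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def approx_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic z-test at significance level alpha."""
    p0 = control_freq
    p1 = case_allele_freq(control_freq, odds_ratio)
    # Each individual contributes two alleles.
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_effect = abs(p1 - p0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in the direction of the effect
    # (the negligible opposite tail is ignored).
    return 1 - NormalDist().cdf(z_crit - z_effect)


# Hypothetical scenarios: a common variant (MAF 30%) and a rare variant (MAF 0.5%),
# both with an odds ratio of 1.2 and equal numbers of cases and controls.
for n in (2_000, 10_000, 50_000):
    common = approx_power(n, n, 0.30, 1.2)
    rare = approx_power(n, n, 0.005, 1.2)
    print(f"n = {n:>6}: common variant {common:.3f}, rare variant {rare:.6f}")
```

Under these assumptions power rises with sample size, and at the same odds ratio it is much lower for the rare variant than for the common one, which is the reason rare-variant studies call for far larger cohorts.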
As an alternative, a newly developed genotyping array (the Illumina HumanExome Chip) containing over 250,000 rare variants obtained from whole-exome sequencing experiments may provide a cheaper, albeit less comprehensive, way to genotype rare variants genome-wide.55 Another hybrid solution would be to perform whole-genome sequencing on a subset of ALS cases and create a custom reference panel, enriched for ALS-related variants, which may then be used for the imputation of rare variants in a large set of GWAS samples. This method has proven fruitful previously and, although still requiring large sample sizes, may reduce costs.56-58 As noted, in order to detect associations for rare variants, even greater sample sizes are required. Therefore, extensive international collaboration and data sharing are becoming more important than ever. It is worth mentioning here that the international Project MinE initiative (www.projectmine.com) was started in 2013 to raise funds for, ultimately, the sequencing of 15,000 ALS genomes and 7,500 control genomes. The world-famous ALS Ice Bucket Challenge has greatly boosted the awareness of ALS and may help to improve the funding for such undertakings.

In conclusion, although ever improving genotyping techniques have greatly increased the number of known genetic causes of familial ALS, genetic studies have, thus far, only been able to explain a small part of sporadic ALS cases. This thesis has provided evidence for additional susceptibility loci in sporadic ALS, but it is clear that many of the genetic causes of ALS remain to be discovered. Increasing sample size, studying rare variants and integrating data from other levels may identify novel disease-causing genes. Perhaps even more importantly, the multiple genetic aberrations have to be linked to pathogenic mechanisms leading to motor neuron degeneration. The identification of such pathways may provide therapeutic targets enabling the development of an effective treatment for patients suffering from this devastating disease.

THE MAIN CONCLUSIONS OF THIS THESIS ARE:
- ANG K17I mutations may modify penetrance and phenotype in a large pedigree of ALS and FTD, in the presence of a TARDBP N352S mutation (Chapter 3)
- By genome-wide mapping of gene expression we have identified CYP27A1 as a susceptibility gene for sporadic ALS (Chapter 4)
- Genome-wide eQTL mapping may provide a valuable approach to prioritize results from genome-wide association studies (Chapter 4)
- There is no evidence for a shared genetic basis of common variants in ALS and multiple sclerosis (Chapter 5)
- UNC13A is a shared risk locus for both ALS and frontotemporal dementia (Chapter 6)
- UNC13A modifies survival in ALS patients (Chapter 7)
- Population-based cohorts are preferred in the study of genetic variants that modify survival (Chapter 7)
- We confirmed TMEM106B as a disease modifier in C9orf72 repeat expansion carriers (Chapter 8)


REFERENCES

1. Al-Chalabi A, Fang F, Hanby MF, Leigh PN, Shaw CE, et al. An estimate of amyotrophic lateral sclerosis heritability using twin data. J Neurol Neurosurg Psychiatr 2010;81:1324-1326.
2. Lee YH, Kim J-H, Seo YH, Choi SJ, Ji JD, et al. Paraoxonase 1 Q192R and L55M polymorphisms and susceptibility to amyotrophic lateral sclerosis: a meta-analysis. Neurol Sci 2015;36:11-20.
3. Wills A-M, Cronin S, Slowik A, Kasperaviciute D, van Es MA, et al. A large-scale international meta-analysis of paraoxonase gene polymorphisms in sporadic ALS. Neurology 2009;73:16-24.
4. Saeed M, Siddique N, Hung WY, Usacheva E, Liu E, et al. Paraoxonase cluster polymorphisms are associated with sporadic ALS. Neurology 2006;67:771-776.
5. Slowik A, Tomik B, Wolkow PP, Partyka D, Turaj W, et al. Paraoxonase gene polymorphisms and sporadic ALS. Neurology 2006;67:766-770.
6. Cronin S, Greenway MJ, Prehn JHM, Hardiman O. Paraoxonase promoter and intronic variants modify risk of sporadic amyotrophic lateral sclerosis. J Neurol Neurosurg Psychiatr 2007;78:984-986.
7. Menini T, Gugliucci A. Paraoxonase 1 in neurological disorders. Redox Rep 2014;19:49-58.
8. van Blitterswijk M, van Es MA, Hennekam EA, Dooijes D, van Rheenen W, et al. Evidence for an oligogenic basis of amyotrophic lateral sclerosis. Hum Mol Genet 2012;21:3776-3784.
9. Wu D, Yu W, Kishikawa H, Folkerth RD, Iafrate AJ, et al. Angiogenin loss-of-function mutations in amyotrophic lateral sclerosis. Ann Neurol 2007;62:609-617.
10. van Es MA, Schelhaas HJ, van Vught PWJ, Ticozzi N, Andersen PM, et al. Angiogenin variants in Parkinson disease and amyotrophic lateral sclerosis. Ann Neurol 2011;70:964-973.
11. Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, et al. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 2010;6:e1000888.
12. Guyant-Maréchal L, Verrips A, Girard C, Wevers RA, Zijlstra F, et al. Unusual cerebrotendinous xanthomatosis with fronto-temporal dementia phenotype. Am J Med Genet 2005;139A:114-117.
13. Gallus GN, Dotti MT, Federico A. Clinical and molecular diagnosis of cerebrotendinous xanthomatosis with a review of the mutations in the CYP27A1 gene. Neurol Sci 2006;27:143-149.
14. Chiò A, Calvo A, Ilardi A, Cavallo E, Moglia C, et al. Lower serum lipid levels are related to respiratory impairment in patients with ALS. Neurology 2009;73:1681-1685.
15. Dupuis L, Corcia P, Fergani A, Gonzalez De Aguilar J-L, Bonnefont-Rousselot D, et al. Dyslipidemia is a protective factor in amyotrophic lateral sclerosis. Neurology 2008;70:1004-1009.
16. Dupuis L, Pradat P-F, Ludolph AC, Loeffler J-P. Energy metabolism in amyotrophic lateral sclerosis. Lancet Neurol 2011;10:75-82.
17. Fairfax BP, Makino S, Radhakrishnan J, Plant K, Leslie S, et al. Genetics of gene expression in primary immune cells identifies cell type-specific master regulators and roles of HLA alleles. Nat Genet 2012;44:502-510.
18. Webster JA, Gibbs JR, Clarke J, Ray M, Zhang W, et al. Genetic control of human brain transcript expression in Alzheimer disease. Am J Hum Genet 2009;84:445-458.


19. Gibbs JR, van der Brug MP, Hernandez DG, Traynor BJ, Nalls MA, et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet 2010;6:e1000952.
20. Fu J, Wolfs MGM, Deelen P, Westra H-J, Fehrmann RSN, et al. Unraveling the regulatory mechanisms underlying tissue-dependent genetic variation of gene expression. PLoS Genet 2012;8:e1002431.
21. Fogh I, Ratti A, Gellera C, Lin K, Tiloca C, et al. A genome-wide association meta-analysis identifies a novel locus at 17q11.2 associated with sporadic amyotrophic lateral sclerosis. Hum Mol Genet 2014;23:2220-2231.
22. Westra H-J, Franke L. From genome to function by studying eQTLs. Biochim Biophys Acta 2014;1842:1896-1902.
23. Ismail A, Cooper-Knock J, Highley JR, Milano A, Kirby J, et al. Concurrence of multiple sclerosis and amyotrophic lateral sclerosis in patients with hexanucleotide repeat expansions of C9ORF72. J Neurol Neurosurg Psychiatr 2013;84:79-87.
24. Etemadifar M, Abtahi S-H, Akbari M, Maghzi A-H. Multiple sclerosis and amyotrophic lateral sclerosis: is there a link? Mult Scler 2012;18:902-904.
25. van Doormaal PTC, Gallo A, Van Rheenen W, Veldink JH, van Es MA, et al. Amyotrophic lateral sclerosis is not linked to multiple sclerosis in a population based study. J Neurol Neurosurg Psychiatr 2013;84:940-941.
26. Fenoglio C, De Riz M, Villa C, Serpente M, Ridolfi E, et al. C9ORF72 repeat expansion not detected in patients with multiple sclerosis. Neurobiol Aging 2014;35:1213.e11-e12.
27. International Multiple Sclerosis Genetics Consortium, Wellcome Trust Case Control Consortium 2, Sawcer S, Hellenthal G, Pirinen M, et al. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature 2011;476:214-219.
28. Valdmanis PN, Meijer IA, Reynolds A, Lei A, MacLeod P, et al. Mutations in the KIAA0196 gene at the SPG8 locus cause hereditary spastic paraplegia. Am J Hum Genet 2007;80:152-161.
29. Lattante S, Ciura S, Rouleau GA, Kabashi E. Defining the genetic connection linking amyotrophic lateral sclerosis (ALS) with frontotemporal dementia (FTD). Trends Genet 2015;31:263-273.
30. van Es MA, Veldink JH, Saris CGJ, Blauw HM, van Vught PWJ, et al. Genome-wide association study identifies 19p13.3 (UNC13A) and 9p21.2 as susceptibility loci for sporadic amyotrophic lateral sclerosis. Nat Genet 2009;41:1083-1087.
31. Chio A, Logroscino G, Hardiman O, Swingler R, Mitchell D, et al. Prognostic factors in ALS: A critical review. Amyotroph Lateral Scler 2009;10:310-323.
32. Landers JE, Melki J, Meininger V, Glass JD, van den Berg LH, et al. Reduced expression of the Kinesin-Associated Protein 3 (KIFAP3) gene increases survival in sporadic amyotrophic lateral sclerosis. Proc Natl Acad Sci USA 2009;106:9004-9009.
33. Traynor BJ, Nalls M, Lai S-L, Gibbs RJ, Schymick JC, et al. Kinesin-associated protein 3 (KIFAP3) has no effect on survival in a population-based cohort of ALS patients. Proc Natl Acad Sci USA 2010;107:12335-12338.
34. van Doormaal PTC, Ticozzi N, Gellera C, Ratti A, Taroni F, et al. Analysis of the KIFAP3 gene in amyotrophic lateral sclerosis: a multicenter survival study. Neurobiol Aging 2014;35:2420.e2413-2424.

35. Sorenson EJ, Mandrekar J, Crum B, Stevens JC. Effect of referral bias on assessing survival in ALS. Neurology 2007;68:600-602.
36. Daoud H, Belzil V, Desjarlais A, Camu W, Dion PA, et al. Analysis of the UNC13A gene as a risk factor for sporadic amyotrophic lateral sclerosis. Arch Neurol 2010;67:516-517.
37. Shatunov A, Mok K, Newhouse S, Weale ME, Smith B, et al. Chromosome 9p21 in sporadic amyotrophic lateral sclerosis in the UK and seven other countries: a genome-wide association study. Lancet Neurol 2010;9:986-994.
38. Chiò A, Mora G, Restagno G, Brunetti M, Ossola I, et al. UNC13A influences survival in Italian amyotrophic lateral sclerosis patients: a population-based study. Neurobiol Aging 2013;34:357.e351-355.
39. Van Blitterswijk M, Mullen B, Heckman MG, Baker MC, Dejesus-Hernandez M, et al. Ataxin-2 as potential disease modifier in C9ORF72 expansion carriers. Neurobiol Aging 2014;35:2421.e2413-2421.e2417.
40. Van Blitterswijk M, Mullen B, Nicholson AM, Bieniek KF, Heckman MG, et al. TMEM106B protects C9ORF72 expansion carriers against frontotemporal dementia. Acta Neuropathol 2014;127:397-406.
41. Van Deerlin VM, Sleiman PMA, Martinez-Lage M, Chen-Plotkin A, Wang L-S, et al. Common variants at 7p21 are associated with frontotemporal lobar degeneration with TDP-43 inclusions. Nat Genet 2010;42:234-239.
42. Gallagher MD, Suh E, Grossman M, Elman L, Mccluskey L, et al. TMEM106B is a genetic modifier of frontotemporal lobar degeneration with C9orf72 hexanucleotide repeat expansions. Acta Neuropathol 2014;127:407-418.
43. Debaisieux S, Schiavo G. TiME for TMEM106B. EMBO J 2014;33:405-406.
44. Stagi M, Klein ZA, Gould TJ, Bewersdorf J, Strittmatter SM. Lysosome size, motility and stress response regulated by fronto-temporal dementia modifier TMEM106B. Mol Cell Neurosci 2014;61:226-240.
45. Iida A, Takahashi A, Deng M, Zhang Y, Wang J, et al. Replication analysis of SNPs on 9p21.2 and 19p13.3 with amyotrophic lateral sclerosis in East Asians. Neurobiol Aging 2011;32:757.e13-e54.
46. van Blitterswijk M, Mullen B, Wojtas A, Heckman MG, Diehl NN, et al. Genetic modifiers in carriers of repeat expansions in the C9ORF72 gene. Mol Neurodegener 2014;9:38.
47. Koppers M, Groen EJN, van Vught PWJ, Van Rheenen W, Witteveen E, et al. Screening for rare variants in the coding region of ALS-associated genes at 9p21.2 and 19p13.3. Neurobiol Aging 2013;34:1518.e5-7.
48. Lavi A, Sheinin A, Shapira R, Zelmanoff D, Ashery U. DOC2B and Munc13-1 differentially regulate neuronal network activity. Cereb Cortex 2014;24:2309-2323.
49. Rothstein JD. Current hypotheses for the underlying biology of amyotrophic lateral sclerosis. Ann Neurol 2009;65 Suppl 1:S3-9.
50. Miller RG, Mitchell JD, Moore DH. Riluzole for amyotrophic lateral sclerosis (ALS)/motor neuron disease (MND). Cochrane Database Syst Rev 2012;3:CD001447.


51. Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 2015;518:197-206.
52. Wood AR, Esko T, Yang J, Vedantam S, Pers TH, et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat Genet 2014;46:1173-1186.
53. Marangi G, Traynor BJ. Genetic causes of amyotrophic lateral sclerosis: new genetic analysis methodologies entailing new opportunities and challenges. Brain Res 2015;1607:75-93.
54. Cirulli ET, Lasseigne BN, Petrovski S, Sapp PC, Dion PA, et al. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways. Science 2015;347:1436-1441.
55. Wessel J, Chu AY, Willems SM, Wang S, Yaghootkar H, et al. Low-frequency and rare exome chip variants associate with fasting glucose and type 2 diabetes susceptibility. Nat Commun 2015;6:5897.
56. Holm H, Gudbjartsson DF, Sulem P, Masson G, Helgadottir HT, et al. A rare variant in MYH6 is associated with high risk of sick sinus syndrome. Nat Genet 2011;43:316-320.
57. Panoutsopoulou K, Tachmazidou I, Zeggini E. In search of low-frequency and rare variants affecting complex traits. Hum Mol Genet 2013;22:R16-21.
58. Joshi PK, Prendergast J, Fraser RM, Huffman JE, Vitart V, et al. Local exome sequences facilitate imputation of less common variants and increase power of genome wide association studies. PLoS ONE 2013;8:e68604.


10

NEDERLANDSE SAMENVATTING (SUMMARY IN DUTCH)

Amyotrophic lateral sclerosis (ALS) is a fatal disease that manifests as progressive muscle weakness, thinning of the muscles (atrophy), spasticity and, eventually, weakness of the respiratory muscles. The symptoms are caused by the death of the nerve cells that control the muscles, which are located in both the brain and the spinal cord. Symptoms often start in one part of the body, for example atrophy of the hand or slurred speech, and then spread to other parts of the body. There is no curative treatment for ALS, and patients die on average about three years after the onset of symptoms. There is one drug, riluzole, that has been shown to slow the disease process by approximately 3-6 months. About 5-10% of patients have familial ALS (in which ALS occurs in the family), while the remaining patients are considered to have sporadic ALS. In familial ALS a specific gene defect that causes the disease is usually known, whereas in sporadic ALS patients an interplay of multiple environmental and genetic factors probably leads to the disease. The clinical features of the two forms, however, are usually identical. Much research has been done on environmental factors and ALS, but only for smoking does there appear to be sufficient scientific evidence. Since the discovery of the first ALS gene (SOD1) in familial ALS patients in 1993, an increasing number of genes have been found that can cause familial ALS. The most important genes causing familial ALS in the Netherlands are C9orf72, TARDBP and FUS. In contrast to familial ALS, far fewer genetic causes are known in sporadic ALS patients. A small proportion (approximately 6-15%) carry defects in the known familial ALS genes. A great deal of genetic research is being done into the remaining causes.
Aanvankelijk gebeurde dat vooral door te kijken naar kandidaatgenen (genen waarvan men bijvoorbeeld op basis van de functie een rol in ALS zou kunnen verwachten). Met de komst van nieuwe genetische onderzoekstechnieken werd het mogelijk om met behulp van ‘chips’ honderdduizenden geneti- sche varianten in het DNA in één experiment te onderzoeken. Dit bood de mogelijkheid om met genoom- wijde associatie studies (GWAS) te testen of veel voorkomende varianten in het DNA (zogenaamde single nucleotide polymorphisms of SNP’s) een verhoogd risico op ALS zouden geven. In een GWAS wordt dit onderzocht door de DNA varianten in een grote groep patiënten te vergelijken met grote aantallen gezonde controlepersonen. Op deze manier hoeft men niet meer alleen naar één kandidaatgen te kijken, maar kan men in een keer variaties verspreid over alle genen onderzoeken (hypotheseloos). Er zijn meerdere inter- nationale genoomwijde associatiestudies gedaan met ALS patiënten, waarbij onder andere associaties met het gen UNC13A op chromosoom 19 en met een gebied op chromosoom 9 werden gevonden. In 2010 werd de zeer belangrijke ontdekking gedaan dat een genafwijking in C9orf72 in het gebied op chromosoom 9 de oorzaak vormt voor ongeveer 37% van familiaire ALS patiënten en 6% van spora- dische patiënten. Bovendien blijkt deze genafwijking een belangrijke oorzaak te zijn voor het ontstaan van frontotemporale dementie (FTD). Frontotemporale dementie is een vorm van dementie die, in tegenstelling tot de ziekte van Alzheimer, niet zozeer gepaard gaat met geheugenverlies, maar vooral tot veranderingen

152 NEDERLANDSE SAMENVATTING (SUMMARY IN DUTCH) in het karakter (ontremming of initiatiefverlies) of taalfuncties leidt. In ongeveer 5-15% van de gevallen komen ALS en FTD samen voor. Dit proefschrift beschrijft verschillende onderzoeken die gericht zijn op het ontdekken van nieuwe genetische risicofactoren voor ALS. Er worden verschillende onderzoeksmethoden toegepast om dit doel te bereiken.
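As an illustration of the case-control comparison that a GWAS performs, the sketch below tests a single SNP by comparing allele counts in patients and controls with a chi-square test. It is a minimal sketch and not the analysis pipeline used in this thesis: the function name `allelic_association` and the allele counts are hypothetical, and a real GWAS repeats such a test for hundreds of thousands of SNPs, with quality control and correction for multiple testing.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Hypothetical helper for illustration only.
    Returns (odds_ratio, chi2, p_value).
    """
    counts = [case_alt, case_ref, ctrl_alt, ctrl_ref]
    if any(c <= 0 for c in counts):
        raise ValueError("all allele counts must be positive")

    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = sum(counts)
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    # Sum of (observed - expected)^2 / expected over the four cells.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Odds of carrying the alternative allele in cases versus controls.
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


if __name__ == "__main__":
    # Made-up allele counts: 100 alleles in cases, 100 in controls.
    odds_ratio, chi2, p_value = allelic_association(30, 70, 20, 80)
    print(f"OR={odds_ratio:.2f} chi2={chi2:.2f} p={p_value:.3f}")
```

An odds ratio above 1 means the allele is more frequent in patients than in controls; whether that difference counts as evidence depends on the p-value after accounting for the very large number of SNPs tested.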

Chapter 2 examines whether genetic variation in the paraoxonase gene influences the risk of ALS in areas with high or low population density. Because exposure to pesticides may be one of the environmental risk factors for ALS, we investigated whether genes that affect how the body processes pesticides could play a role in the development of ALS. Paraoxonase is an enzyme in the body that breaks down pesticides, and earlier studies did indeed find associations between paraoxonase genes (PON) and ALS. In Chapter 2 we studied variation in the paraoxonase genes in 98 ALS patients from South East England and, at the same time, used postcode and geographical data to determine whether patients lived in an area of low or high population density. The study showed that carriers of a particular variant of the PON1 gene had an increased risk of ALS in areas with low population density, whereas this was not the case in areas with high population density. The study thus demonstrated a gene-environment interaction between variation in PON1 and population density.

Chapter 3 describes a large family with familial ALS in which a mutation in angiogenin (ANG) segregates with the disease. This so-called K17I mutation was present in all affected family members, but also in several unaffected family members, most of whom were younger than 50 years and could therefore, in theory, still develop ALS. One of the patients with the ANG K17I mutation had an atypical presentation, beginning with parkinsonism and later developing signs of ALS combined with behavioural changes consistent with FTD.

In Chapter 4 we combined genome-wide DNA variation (SNPs) with gene expression data. Gene expression is a measure of the activity of genes. In this study we looked for DNA variants that not only possibly increase the risk of ALS but also bring about altered gene activity (gene expression) in the blood of ALS patients. Such variants are probably more important for the development of a disease because they also cause functional changes in a body tissue such as blood. For this study we used genome-wide SNP data from a total of 3,568 sporadic ALS patients and 10,163 controls, and obtained gene expression data in blood from 323 ALS patients and 413 controls. Using an internal replication step, we found that SNPs in the gene CYP27A1 are not only associated with ALS but also result in altered expression of CYP27A1. CYP27A1 is a gene involved in cholesterol metabolism, and mutations in the gene can cause a rare neurological disease, cerebrotendinous xanthomatosis (CTX). CTX can present with neurological signs that resemble symptoms of ALS. There are also indications in the literature that lipid metabolism plays a role in ALS. These findings make CYP27A1 an interesting candidate for a risk-increasing gene in ALS. Ideally our finding would be confirmed in another, independent study, but this has not yet happened.

Chapters 5 and 6 examine genetic overlap between ALS and two other neurological disorders: multiple sclerosis (MS) and frontotemporal dementia (FTD). Any genetic overlap may provide more insight into disease mechanisms that play a role in both disorders. To this end, we first combined genome-wide association studies of ALS (3,762 patients and 4,886 controls) with those of 4,088 MS patients and 7,144 healthy controls (Chapter 5). After meta-analysis of the data we found no evidence of shared genetic risk factors between ALS and MS. Genetic risk factors for MS were not found to confer risk of ALS, and vice versa.

In Chapter 6 we combined genome-wide SNP data from a total of 4,377 ALS patients and 13,017 controls with a GWAS of 435 FTD patients (all with FTD pathology in the brain confirmed at autopsy) and 1,414 controls. In a meta-analysis we saw that adding the FTD patients to the ALS data strongly increased not only the earlier association signal on chromosome 9 but also the signal of UNC13A on chromosome 19. Because of the large difference in the numbers of ALS patients and FTD patients in the study, we applied two further methods, which showed that the increased association signals were driven not only by ALS but also by FTD. The study describes for the first time a link between genetic variation in UNC13A and frontotemporal dementia. UNC13A is involved in regulating the release of signalling molecules (neurotransmitters) between nerve cells, and disrupted functioning of UNC13A may, by altering neural networks in the brain, lead to damage to nerve cells. One of these changes may be increased release of the neurotransmitter glutamate, which can damage nerve cells in the cerebral cortex through 'overstimulation' (excitotoxicity). Earlier literature already provides indications for this mechanism of overstimulation by glutamate. It is also worth noting here that riluzole, the only drug with a proven effect in ALS, is a glutamate inhibitor. It is not known, however, how riluzole slows the course of ALS.

In Chapter 7 the role of UNC13A in ALS is investigated further by looking at a possible relation with age at onset of ALS symptoms or an effect on survival. For this purpose we selected 450 ALS patients and 524 control subjects from a prospective cohort study of ALS and examined a genetic variant in UNC13A. A SNP in UNC13A proved to be associated with ALS in this new cohort. Moreover, we showed that genetic variation in UNC13A is associated with the survival time of ALS patients. This latter finding also held in a total of 1,767 ALS patients and 1,817 controls from the Netherlands, Belgium and Sweden who had taken part in a previously published GWAS. We found no relation between UNC13A and age at onset of ALS symptoms. The finding that UNC13A is associated with survival in ALS patients has been confirmed in an Italian study, and UNC13A may also be a good target for a possible treatment. First, however, more insight will need to be gained into the role that UNC13A plays in the disease mechanism of ALS.

A genetic defect in the gene C9orf72 can cause both ALS and FTD and is a common cause of both disorders. The function of C9orf72 in the nervous system is as yet unknown. Chapter 8 investigates whether carriers of the C9orf72 gene defect have other genetic variants that determine whether someone develops mainly ALS symptoms or rather signs of FTD. For this we collected as many ALS and FTD patients with a C9orf72 gene defect as possible for whom GWAS data were also available. Through a worldwide collaboration we were able to collect 402 ALS patients and 253 FTD patients with a C9orf72 gene defect and genome-wide SNP data. We found no new variants that determine whether a person develops ALS or FTD. Earlier candidate-gene studies in carriers of the C9orf72 gene defect found that the gene TMEM106B and a variant in ataxin-2 (ATXN2) can influence the development of ALS or FTD. We also found indications for this in our data, although these effects appear less strong. In the future, further international collaboration can bring together a larger group of ALS and FTD patients with a C9orf72 gene defect, increasing the chance of finding genetic variants that influence whether ALS or FTD develops.

Chapter 9 summarises the findings of the studies in this thesis and places them in a broader context.


DANKWOORD (ACKNOWLEDGEMENTS)

You do not obtain a PhD on your own, and I would therefore like to thank everyone who supported me in bringing this thesis about.

First of all, my thanks go to my supervisors, prof. dr. L.H. van den Berg and prof. dr. J.H. Veldink.

Dear Leonard, it has been a privilege to do my PhD under your wing. You gave me every freedom and the opportunity to do research at world-class level. Your enthusiasm, creativity and, above all, humour have been an enormous source of inspiration, and with the legendary words "is it finished yet?" you managed to spur me on time and again. I have admired how you always keep the priorities and the main thread of a research project razor-sharp in view, which greatly benefited the quality of the articles.

Dear Jan, your supervision has been hugely inspiring and, above all, instructive. Your impressive statistical knowledge helped me enormously in successfully analysing complex mountains of data. I have always been impressed by the complexity distortion field with which you manage to explain a hidden Markov model, and by the role model you are at the ALS conferences. Finally, through your inexhaustible source of ideas you made me realise that the statement "nothing will come of it anyway…" will never contribute to new ground-breaking research results.

I would like to thank the members of the assessment committee, prof. dr. P.I.W. de Bakker, prof. dr. L. Franke, prof. dr. E.M. Hol, prof. dr. J.K. Ploos van Amstel, prof. dr. G.J.E. Rinkel and prof. dr. J.C. van Swieten, for their careful reading of my thesis and their favourable assessment.

My PhD trajectory has been interwoven with clinical rotations as part of my training as a neurologist. I would like to thank my training supervisors, prof. dr. J.H.J. Wokke and dr. T. Seute, for the excellent neurology training that I am able to enjoy at UMC Utrecht alongside my PhD research. I look forward to the coming years in the clinic.

Dear colleagues of the Neurology department, thank you for all the conviviality and collegiality that our group of residents offers, both on the work floor and beyond, during drinks, residents' weekends and, of course, the Babinski!

My thanks go to all staff of the ALS Centre who made the research in my thesis possible. In particular I would like to thank Nienke and Inge, thanks to whom the outpatient clinic days run smoothly and who are a mainstay for patients with ALS.


I would like to thank all co-authors for their valuable scientific input during my research projects and the writing of the article manuscripts. Prof. dr. J. Pasterkamp, Jeroen, thank you for your functional-biological additions to my manuscripts and your input during lab meetings. I would like to thank Prof. Ammar Al-Chalabi from the MRC Centre for Neurodegeneration Research at King’s College London for the opportunity I had to do my 6-month research project as a visiting student and the fruitful collaboration we had in further joint research projects.

Dear Peter, Raymond and Jelena, thank you for your indispensable help in the lab. From optimising TaqMans to the ever-expanding number of DNA plates, your help has been invaluable!

All lab mates of the Laboratory of Experimental Neurology, Wouter, Perry, Oliver, Ewout, Max, Gijs, Lotte, Meinie, Annelot, Anna, Dianne, Marc, Sandra: I have enjoyed our particularly close-knit group. Wouter, Perry, Oliver, thank you for the scripting tips, enlightening discussions, Scooter sessions, putting Roosendaal on the map and exploring Dublin and Sheffield by night. Ewout, thank you for guarding the 10:00, 11:30 and 15:00 moments; I assume this works in Edinburgh too? Anna, your carrot and walnut cake is truly unsurpassed. Paul, Michael, Hylke and Christiaan, a.k.a. the "ALS boys", you set the tone for a close-knit and sociable research group and set an example for doing high-quality research.

I would also like to thank all the other neuromuscular researchers, Sonja, Nadia, Renske, Marloes, Renée, Henk-Jan, Mark, Anne, Camiel and Bas, for the enjoyable coffee breaks, although I do think that real men are allowed to drink a bit more coffee.

Students Peter and Femke, thank you for your help with pipetting DNA, plating out, pick-ups in Leuven and deliveries to Rotterdam! That annual public transport pass was certainly put to good use!

My paranymphs, Wouter Boer and Henk-Jan Prins, you have been able to follow my research ups and downs from the first gel I ran as a student to the completion of my PhD. Thank you for standing at my side during the defence! And Henk-Jan, good luck with completing your own thesis.

Family and friends, Reina, Anne Fleur, thank you for patiently listening to my incomprehensible explanations about "snips", "gee-wasses", "unks" and "orfs", for your understanding at moments when a research deadline took priority for a while, and also for all the welcome and enjoyable distraction.

My dear sister Meta, I admire your humour and perseverance, and I never thought that, despite all my research frustrations, you would later end up in a genetics lab for your own PhD research, but I am ever so proud! Perhaps it is something that is dominantly inherited… Good luck with the final stretch of your research and, of course, with Mr. Darcy!

Dear Dad and Mum, without you I would never have come this far. Thank you for your unconditional support and patience, even in the past year. Dad, I am so glad that you can be there!

My dearest Fre, the life of a PhD student is not only a bed of roses, and I thank you for your support and the patience you have had over the past years of research, especially when work on a manuscript went on until late. Thank you above all for everything we are able to enjoy together so much; I hope there is much more of that to come!

Frank.


LIST OF PUBLICATIONS

THIS THESIS

van Es MA, Diekstra FP, Veldink JH, Baas F, Bourque PR, Schelhaas HJ, Strengman E, Hennekam EAM, Lindhout D, Ophoff RA, van den Berg LH. A case of ALS-FTD in a large FALS pedigree with a K17I ANG mutation. Neurology. 2009;72:287–288.

Diekstra FP, Beleza-Meireles A, Leigh NP, Shaw CE, Al-Chalabi A. Interaction between PON1 and population density in amyotrophic lateral sclerosis. NeuroReport. 2009;20:186–190.

Diekstra FP, Saris CGJ, Van Rheenen W, Franke L, Jansen RC, van Es MA, van Vught PWJ, Blauw HM, Groen EJN, Horvath S, Estrada K, Rivadeneira F, Hofman A, Uitterlinden AG, Robberecht W, Andersen PM, Melki J, Meininger V, Hardiman O, Landers JE, Brown RH, Shatunov A, Shaw CE, Leigh PN, Al-Chalabi A, Ophoff RA, van den Berg LH, Veldink JH. Mapping of gene expression reveals CYP27A1 as a susceptibility gene for sporadic ALS. PLoS ONE. 2012;7:e35333.

Diekstra FP, van Vught PWJ, Van Rheenen W, Koppers M, Pasterkamp RJ, van Es MA, Schelhaas HJ, de Visser M, Robberecht W, Van Damme P, Andersen PM, van den Berg LH, Veldink JH. UNC13A is a modifier of survival in amyotrophic lateral sclerosis. Neurobiol Aging. 2012;33:630.e3–8.

Goris A, van Setten J, Diekstra F, Ripke S, Patsopoulos NA, Sawcer SJ, International Multiple Sclerosis Genetics Consortium, van Es M, Australia and New Zealand MS Genetics Consortium, Andersen PM, Melki J, Meininger V, Hardiman O, Landers JE, Brown RH, Shatunov A, Leigh N, Al-Chalabi A, Shaw CE, Traynor BJ, Chiò A, Restagno G, Mora G, Ophoff RA, Oksenberg JR, Van Damme P, Compston A, Robberecht W, Dubois B, van den Berg LH, de Jager PL, Veldink JH, de Bakker PIW. No evidence for shared genetic basis of common variants in multiple sclerosis and amyotrophic lateral sclerosis. Hum Mol Genet. 2014;23:1916–1922.

Diekstra FP, Van Deerlin VM, van Swieten JC, Al-Chalabi A, Ludolph AC, Weishaupt JH, Hardiman O, Landers JE, Brown RH, van Es MA, Pasterkamp RJ, Koppers M, Andersen PM, Estrada K, Rivadeneira F, Hofman A, Uitterlinden AG, Van Damme P, Melki J, Meininger V, Shatunov A, Shaw CE, Leigh PN, Shaw PJ, Morrison KE, Fogh I, Chiò A, Traynor BJ, Czell D, Weber M, Heutink P, de Bakker PIW, Silani V, Robberecht W, van den Berg LH, Veldink JH. C9orf72 and UNC13A are shared risk loci for amyotrophic lateral sclerosis and frontotemporal dementia: a genome-wide meta-analysis. Ann Neurol. 2014;76:120–133.


Diekstra FP, Nalls MA, Van Deerlin VM, Ferrari R, van Es MA, van Swieten JC, Heutink P, Shatunov A, Al-Chalabi A, Hardiman O, Shaw PJ, Morrison KE, van Damme P, Robberecht W, van der Zee J, van Broeckhoven C, Brice A, Le Ber I, Graff C, Pickering-Brown S, Chiò A, Singleton AB, Traynor BJ, Hardy J, Rademakers R, van den Berg LH, Veldink JH. Genetic modifiers in C9orf72 repeat expansion carriers: a genome-wide analysis. Manuscript in preparation.

OTHER PUBLICATIONS

Landers J, Shi L, Cho T, Glass J, Shaw C, Nigel Leigh P, Diekstra F, Polak M, Rodriguez-Leyva I, Niemann S, Traynor B, McKenna-Yasek D, Sapp P, Al-Chalabi A, Wills A, Brown R. A common haplotype within the PON1 promoter region is associated with sporadic ALS. Amyotroph Lateral Scler. 2008;10:1–9.

Landers JE, Melki J, Meininger V, Glass JD, van den Berg LH, van Es MA, Sapp PC, van Vught PWJ, McKenna-Yasek DM, Blauw HM, Cho T-J, Polak M, Shi L, Wills A-M, Broom WJ, Ticozzi N, Silani V, Ozoguz A, Rodriguez-Leyva I, Veldink JH, Ivinson AJ, Saris CGJ, Hosler BA, Barnes-Nessa A, Couture N, Wokke JHJ, Kwiatkowski TJ, Ophoff RA, Cronin S, Hardiman O, Diekstra FP, Nigel Leigh P, Shaw CE, Simpson CL, Hansen VK, Powell JF, Corcia P, Salachas F, Heath S, Galan P, Georges F, Horvitz HR, Lathrop M, Purcell S, Al-Chalabi A, Brown RH. Reduced expression of the Kinesin-Associated Protein 3 (KIFAP3) gene increases survival in sporadic amyotrophic lateral sclerosis. Proc Natl Acad Sci USA. 2009;106:9004–9009.

Blauw HM, Al-Chalabi A, Andersen PM, van Vught PWJ, Diekstra FP, van Es MA, Saris CGJ, Groen EJN, Van Rheenen W, Koppers M, van’t Slot R, Strengman E, Estrada K, Rivadeneira F, Hofman A, Uitterlinden AG, Kiemeney LA, Vermeulen SHM, Birve A, Waibel S, Meyer T, Cronin S, McLaughlin RL, Hardiman O, Sapp PC, Tobin MD, Wain LV, Tomik B, Slowik A, Lemmens R, Rujescu D, Schulte C, Gasser T, Brown RH, Landers JE, Robberecht W, Ludolph AC, Ophoff RA, Veldink JH, van den Berg LH. A large genome scan for rare CNVs in amyotrophic lateral sclerosis. Hum Mol Genet. 2010;19:4091–4099.

van Es MA, Schelhaas HJ, van Vught PWJ, Ticozzi N, Andersen PM, Groen EJN, Schulte C, Blauw HM, Koppers M, Diekstra FP, Fumoto K, LeClerc AL, Keagle P, Bloem BR, Scheffer H, van Nuenen BFL, Van Blitterswijk M, Van Rheenen W, Wills A-M, Lowe PP, Hu G-F, Yu W, Kishikawa H, Wu D, Folkerth RD, Mariani C, Goldwurm S, Pezzoli G, Van Damme P, Lemmens R, Dahlberg C, Birve A, Fernández-Santiago R, Waibel S, Klein C, Weber M, Van Der Kooi AJ, de Visser M, Verbaan D, van Hilten JJ, Heutink P, Hennekam EAM, Cuppen E, Berg D, Brown RH, Silani V, Gasser T, Ludolph AC, Robberecht W, Ophoff RA, Veldink JH, Pasterkamp RJ, de Bakker PIW, Landers JE, van de Warrenburg BP, van den Berg LH. Angiogenin variants in Parkinson disease and amyotrophic lateral sclerosis. Ann Neurol. 2011;70:964–973.

Groen EJN, Van Rheenen W, Koppers M, van Doormaal PTC, Vlam L, Diekstra FP, Dooijes D, Pasterkamp RJ, van den Berg LH, Veldink JH. CGG-repeat expansion in FMR1 is not associated with amyotrophic lateral sclerosis. Neurobiol Aging. 2012;33:1852.e1–e3.

Ahmeti KB, Ajroud-Driss S, Al-Chalabi A, Andersen PM, Armstrong J, Birve A, Blauw HM, Brown RH, Bruijn L, Chen W, Chiò A, Comeau MC, Cronin S, Diekstra FP, Soraya Gkazi A, Glass JD, Grab JD, Groen EJ, Haines JL, Hardiman O, Heller S, Huang J, Hung W-Y, ITALSGEN Consortium, Jaworski JM, Jones A, Khan H, Landers JE, Langefeld CD, Leigh PN, Marion MC, McLaughlin RL, Meininger V, Melki J, Miller JW, Mora G, Pericak-Vance MA, Rampersaud E, Robberecht W, Russell LP, Salachas F, Saris CG, Shatunov A, Shaw CE, Siddique N, Siddique T, Smith BN, Sufit R, Topp S, Traynor BJ, Vance C, Van Damme P, van den Berg LH, van Es MA, Van Vught PW, Veldink JH, Yang Y, Zheng JG, ALSGEN Consortium. Age of onset of amyotrophic lateral sclerosis is modulated by a locus on 1p34.1. Neurobiol Aging. 2013;34:357.e7–19.

Van Rheenen W, Diekstra FP, van Doormaal PTC, Seelen M, Kenna K, McLaughlin R, Shatunov A, Czell D, van Es MA, van Vught PWJ, Van Damme P, Smith BN, Waibel S, Schelhaas HJ, Van Der Kooi AJ, de Visser M, Weber M, Robberecht W, Hardiman O, Shaw PJ, Shaw CE, Morrison KE, Al-Chalabi A, Andersen PM, Ludolph AC, Veldink JH, van den Berg LH. H63D polymorphism in HFE is not associated with amyotrophic lateral sclerosis. Neurobiol Aging. 2013;34:1517.e5–e7.

Fogh I, Ratti A, Gellera C, Lin K, Tiloca C, Moskvina V, Corrado L, Sorarù G, Cereda C, Corti S, Gentilini D, Calini D, Castellotti B, Mazzini L, Querin G, Gagliardi S, Del Bo R, Conforti FL, Siciliano G, Inghilleri M, Saccà F, Bongioanni P, Penco S, Corbo M, Sorbi S, Filosto M, Ferlini A, Di Blasio AM, Signorini S, Shatunov A, Jones A, Shaw PJ, Morrison KE, Farmer AE, Van Damme P, Robberecht W, Chiò A, Traynor BJ, Sendtner M, Melki J, Meininger V, Hardiman O, Andersen PM, Leigh NP, Glass JD, Overste D, Diekstra FP, Veldink JH, van Es MA, Shaw CE, Weale ME, Lewis CM, Williams J, Brown RH, Landers JE, Ticozzi N, Ceroni M, Pegoraro E, Comi GP, D’Alfonso S, van den Berg LH, Taroni F, Al-Chalabi A, Powell J, Silani V, SLAGEN Consortium and Collaborators. A genome-wide association meta-analysis identifies a novel locus at 17q11.2 associated with sporadic amyotrophic lateral sclerosis. Hum Mol Genet. 2014;23:2220–2231.

Smith BN, Ticozzi N, Fallini C, Gkazi AS, Topp S, Kenna KP, Scotter EL, Kost J, Keagle P, Miller JW, Calini D, Vance C, Danielson EW, Troakes C, Tiloca C, Al-Sarraj S, Lewis EA, King A, Colombrita C, Pensato V, Castellotti B, de Belleroche J, Baas F, Asbroek ten ALMA, Sapp PC, McKenna-Yasek D, McLaughlin RL, Polak M, Asress S, Esteban-Pérez J, Muñoz-Blanco JL, Simpson M, SLAGEN Consortium, Van Rheenen W, Diekstra FP, Lauria G, Duga S, Corti S, Cereda C, Corrado L, Sorarù G, Morrison KE, Williams KL, Nicholson GA, Blair IP, Dion PA, Leblond CS, Rouleau GA, Hardiman O, Veldink JH, van den Berg LH, Al-Chalabi A, Pall H, Shaw PJ, Turner MR, Talbot K, Taroni F, García-Redondo A, Wu Z, Glass JD, Gellera C, Ratti A, Brown RH, Silani V, Shaw CE, Landers JE. Exome-wide rare variant analysis identifies TUBA4A mutations associated with familial ALS. Neuron. 2014;84:324–331.


Van Rheenen W, Diekstra FP, van den Berg LH, Veldink JH. Are CHCHD10 mutations indeed associated with familial amyotrophic lateral sclerosis? Brain. 2014;137:e313.

Cats EA, van der Pol W-L, Tio-Gillen AP, Diekstra FP, van den Berg LH, Jacobs BC. Clonality of anti-GM1 IgM antibodies in multifocal motor neuropathy and the Guillain-Barré syndrome. J Neurol Neurosurg Psychiatr. 2015;86:502–504.

Kremer PHC, Koeleman BPC, Rinkel GJE, Diekstra FP, van den Berg LH, Veldink JH, Klijn CJM. Susceptibility loci for sporadic brain arteriovenous malformation; a replication study and meta-analysis. J Neurol Neurosurg Psychiatr. Accepted for publication.

SUBMITTED

De Muynck L, Diekstra FP, Borroni B, Medic J, Thijs V, Camuzat A, Van Den Bosch L, van den Berg LH, Robberecht W, Le Ber I, Ravits J, Veldink JH, Van Damme P. C9orf72 expression levels modify survival in sporadic and familial ALS. Submitted.

van Doormaal PTC, Diekstra FP, van den Heuvel DMA, van Rheenen W, Overste D, Dekker AM, Schellevis RD, van Damme P, de Bakker PIW, Francioli LC, Pasterkamp RJ, van den Berg LH, Veldink JH. The role of de novo mutations in the development of sporadic amyotrophic lateral sclerosis. Submitted.

van Rheenen W, Shatunov A, Dekker AM, McLaughlin RL, Diekstra FP, Pulit SL, de Jong S, Andres CR, van Doormaal PTC, Tazelaar GH, Koppers M, Blokhuis AM, Sproviero W, Jones A, Kenna KP, van Eijk KR, Harschnitz O, Robinson MR, Vosa U, Medic J, Schellevis R, Brands W, Menelaou A, Rogelj B, Millechamps S, de Carvalho M, Mora JS, Rojas-García R, Chandran S, Colville S, Morrison K, Shaw PJ, Hardy J, Orrell RW, Petri S, Sendtner M, Meyer T, Staats KA, Ophoff RA, Van Deerlin VM, Basak N, Parman Y, Uitterlinden AG, Rivadeneira F, Estrada K, Hofman A, Curtis C, Blauw HM, de Visser M, van der Kooi AJ, Goris A, Weber M, Shaw CE, Smith BN, Fogh I, Silani V, Powell J, SLAGEN consortium, FALS sequencing consortium, Casale F, Chio A, Beghi E, Pupillo E, Logroscino G, Yang J, Wray NR, Visscher P, Franke L, Ludolph AC, Weishaupt J, Robberecht W, van Damme P, Brown Jr RH, Landers JE, Hardiman O, Andersen PM, Corcia P, Vourch P, de Bakker PIW, Pasterkamp JR, van Es MA, Lewis C, Breen G, Al-Chalabi A, van den Berg LH, Veldink JH. Novel risk variants and genetic architecture in amyotrophic lateral sclerosis. Submitted.


CURRICULUM VITAE

Frank Paul Diekstra was born on 4 August 1983 in Nijmegen. He obtained his VWO (pre-university) diploma at the Stedelijk Gymnasium in Nijmegen in 2001. In the same year he began studying Medicine at Utrecht University. During his studies he carried out scientific research at the Laboratory of Experimental Neurology at UMC Utrecht under the supervision of dr. Michael van Es and prof. dr. Leonard van den Berg. In the final year of his studies he spent six months doing research at the MRC Centre for Neurodegeneration Research of the Institute of Psychiatry at King’s College London under the supervision of prof. Ammar Al-Chalabi. After passing his medical examination in the summer of 2008, he was accepted into the Neurology residency programme at UMC Utrecht in 2009. He returned to the Laboratory of Experimental Neurology for his PhD research into genetic risk factors for ALS under the supervision of prof. Leonard van den Berg and prof. dr. Jan Veldink, the results of which are described in this thesis.
