Molecular Psychiatry (2013) 18, 195–205
© 2013 Macmillan Publishers Limited All rights reserved 1359-4184/13
www.nature.com/mp

ORIGINAL ARTICLE

Genome-wide association study meta-analysis of European and Asian-ancestry samples identifies three novel loci associated with bipolar disorder

DT Chen1, X Jiang1, N Akula1, YY Shugart1, JR Wendland1, CJM Steele1, L Kassem1, J-H Park2, N Chatterjee2, S Jamain3, A Cheng4, M Leboyer3, P Muglia5, TG Schulze1,6, S Cichon7, MM Nöthen7, M Rietschel8, BiGS9 and FJ McMahon1

1Human Genetics Branch, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, US Department of Health and Human Services, Bethesda, MD, USA; 2Division of Cancer Epidemiology and Genetics, NCI, NIH, DHHS, Rockville, MD, USA; 3Inserm U955, Department of Psychiatry, Groupe Hospitalier Henri Mondor-Albert Chenevier, AP-HP, Université Paris Est, Fondation FondaMental, Créteil, France; 4Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; 5Department of Psychiatry, University of Toronto, Toronto, ON, Canada; 6Section on Psychiatric Genetics, Department of Psychiatry and Psychotherapy, University Medical Center, Georg-August-Universität, Göttingen, Germany; 7Institute of Neuroscience and Medicine, Juelich, Germany and Department of Genomics, Life and Brain Center, University of Bonn, Bonn, Germany and 8Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Mannheim, Mannheim, Germany

Meta-analyses of bipolar disorder (BD) genome-wide association studies (GWAS) have identified several genome-wide significant signals in European-ancestry samples, but so far account for little of the inherited risk. We performed a meta-analysis of ~750 000 high-quality genetic markers on a combined sample of ~14 000 subjects of European and Asian ancestry (phase I). The most significant findings were further tested in an extended sample of ~17 700 cases and controls (phase II).
The results suggest novel association findings near the genes TRANK1 (LBA1), LMAN2L and PTGFR. In phase I, the most significant single nucleotide polymorphism (SNP), rs9834970 near TRANK1, was significant at the P = 2.4 × 10^-11 level, with no heterogeneity. Supportive evidence for prior association findings near ANK3 and a locus on chromosome 3p21.1 was also observed. The phase II results were similar, although the heterogeneity test became significant for several SNPs. On the basis of these results and other established risk loci, we used the method developed by Park et al. to estimate the number, and the effect size distribution, of BD risk loci that could still be found by GWAS methods. We estimate that > 63 000 case–control samples would be needed to identify the ~105 BD risk loci discoverable by GWAS, and that these will together explain < 6% of the inherited risk. These results support previous GWAS findings and identify three new candidate genes for BD. Further studies are needed to replicate these findings and may potentially lead to identification of functional variants. Sample size will remain a limiting factor in the discovery of common alleles associated with BD.

Molecular Psychiatry (2013) 18, 195–205; doi:10.1038/mp.2011.157; published online 20 December 2011

Keywords: ANK3; bipolar disorder; LBA1; meta-analysis; TRANK1; 3p21

Introduction

The genetic basis of bipolar disorder (BD) is still largely unknown despite robust evidence of high heritability estimated by previous twin studies.1–11 With a lifetime prevalence worldwide between 0.5 and 1.5%, BD is characterized clinically by often disabling fluctuations of mood and behavior, commonly developing in late adolescence to early adulthood.
Although the pathogenesis of BD remains unclear, genome-wide association studies (GWAS) have so far identified and replicated a few risk loci (near the genes DGKH, ANK3 and CACNA1C),12–16 along with a locus on chromosome 3p21.1 that harbors a number of genes.17 Together, these loci account for little of the BD heritability, suggesting that additional risk loci remain undiscovered. The total BD GWAS sample size studied so far remains low compared with many other common traits studied, such as type 2 diabetes, height, serum lipids, colorectal cancer and rheumatoid arthritis.18–23 Some of the missing heritability may be explained by additional risk loci that can only be identified in larger sample sizes.24

Psychiatric disorders such as BD pose statistical challenges when it comes to very large sample sizes. Phenotyping by direct diagnostic evaluation is expensive. Reliance on physician or hospital-assigned diagnoses can save money, but introduces potential biases, like changing diagnostic criteria, which can be difficult to correct.25 Increasing sample size by combining data across studies can be fruitful. Meta-analysis is an efficient and largely unbiased way to increase effective sample size by systematically combining association signals across studies. As most common genetic variation is ancient and widespread, some risk alleles may be shared across continental populations. It is possible to use meta-analysis to combine study samples of differing ancestry, as long as appropriate ancestry-matched controls are used within each study.26–28

In this study, we have sought to identify novel risk alleles for BD by meta-analysis of world-wide BD GWAS, comprising case–control samples of both European and Asian ancestry. The combined sample size of 17 656 is the largest so far in BD, to our knowledge. The results suggest significant novel association signals near the genes TRANK1 (LBA1), LMAN2L and PTGFR, and provide supportive evidence for the previously reported association signals near ANK3 and within the 3p21.1 locus. Largely consistent signals were observed in both the European-ancestry and Asian-ancestry samples. Based on these findings and discoveries to date, we also present a GWAS discovery trajectory for BD.

Correspondence: Dr DT Chen, Human Genetics Branch, National Institute of Mental Health, Intramural Research Program, National Institutes of Health, US Department of Health and Human Services, 35 Convent Drive, Room 1A-208, Bethesda, MD 20892, USA.
E-mail: [email protected]
9The Bipolar Genome Study (BiGS) authorship list is shown in the Appendix.
Received 31 May 2011; revised 7 October 2011; accepted 17 October 2011; published online 20 December 2011

Materials and methods

Study samples

The samples used in the meta-analysis have been described previously, and details are provided in Table 1 and Supplementary Table 1.12,15,17,29–32 For phase I, we obtained five European and one Asian-ancestry sample, totaling ~14 000 cases and controls. The most significant hits (P < 4 × 10^-3) were tested in an extended sample that included phase I plus two independent European-ancestry samples (~3800 cases/controls). We refer to this as the phase II sample.

Imputation

Genotype data from the NIMH-GAIN, German and TGEN samples were used to impute data on 2.1 million HapMap phase 2 single-nucleotide polymorphisms (SNPs), by use of the program Markov Chain Haplotyping (MACH version 1.0; http://www.sph.umich.edu/csg/abecasis/MACH/download/).33 MACH uses Markov chain haplotyping to resolve haplotypes, and thereby missing genotypes, from observed genotypes in unrelated individuals.

[…] flagged as having different alleles than in HapMap CEU or as monomorphic were reviewed, after which they were recoded for the reverse strand (flipped) or dropped. SNPs flagged for allele frequencies markedly different from HapMap CEU were also reviewed. Palindromic SNPs whose allele frequencies were consistent with reversed coding were flipped. SNPs with unexpected allele frequencies were dropped. PLINK (version 1.4) was used to flip and drop SNPs.34 After all allele-coding, monomorphism and palindrome issues were resolved, imputation was run again. SNPs in the result files were dropped if the minor allele frequency (MAF) in cases or controls was < 0.05, or if the error rate (in the .erate output file) was > 0.01. The imputed data were then formatted into PLINK binaries for analysis. Supplementary Table 1 provides a detailed description of genotyping and imputation for the Taiwan, Wellcome Trust Case–Control Consortium, STEP-BD,35 FondaMental Bipolar and GlaxoSmithKline (GSK) samples.

Meta-analysis

PLINK output (assoc) files were modified with columns for direction-of-association, sample size and strand. For most samples, sample size equaled the sum of cases and controls included in the final analysis, after the quality-control steps were complete. For STEP-BD, sample size was set to equal the number of cases plus controls that did not overlap with those in the NIMH-GAIN or TGEN. This was done to avoid over-weighting the results from the NIMH-control sample, overlapping portions of which were included in the NIMH-GAIN, TGEN and STEP-BD samples. Modified files were loaded into METAL (November 2010 version), then processed using the GENOMICCONTROL option, which applies a genomic control36 correction in samples where the genomic inflation factor is > 1.0. METAL weights each sample based on the square root of the sample size. Care was taken to avoid mis-assigning alleles when combining results from different samples and platforms.37 Using the ‘STRANDLABEL’ and ‘USESTRAND ON’ commands in METAL, these SNPs were recoded to ensure consistent allele coding across the samples analyzed. Because the German sample was genotyped on the Illumina platform (San Diego, CA, USA) that contains no palindromic SNPs, we used that sample as the gold standard for our study. Results were combined under a fixed-effects model, using METAL. For initial discovery purposes, the fixed-effects model is more powerful than the traditional random-effects model, and Pereira et al.24 suggest the fixed-effects model is preferable, especially when the cumulative sample size is in the range of 2000–20 000.

After selected results were confirmed, heterogeneity statistics were calculated, using Comprehensive Meta-analysis version 2.0. When heterogeneity tests are significant, assumptions of the fixed-effects model
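The two numerical steps named in the Meta-analysis section (a genomic control correction applied only when the inflation factor exceeds 1.0, and combining studies with weights proportional to the square root of sample size) can be illustrated with a minimal Python sketch. This is not METAL's code. The function names are hypothetical, and it assumes that each study reports a per-SNP Z-score whose sign already refers to the same allele, which is what the direction-of-association and strand columns are meant to guarantee.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Inflation factor lambda: median observed z^2 over its null expectation."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def apply_genomic_control(z_scores):
    """Shrink one study's Z-scores by sqrt(lambda), only when lambda > 1.0."""
    lam = genomic_inflation(z_scores)
    if lam <= 1.0:
        return list(z_scores)
    scale = math.sqrt(lam)
    return [z / scale for z in z_scores]


def combine_z(z_by_study, n_by_study):
    """Combine one SNP's Z-scores across studies with weights sqrt(N_i).

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)), where w_i = sqrt(N_i).
    Signs must already be aligned to the same reference allele.
    """
    weights = [math.sqrt(n) for n in n_by_study]
    numerator = sum(w * z for w, z in zip(weights, z_by_study))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    return 2.0 * NormalDist().cdf(-abs(z))
```

In this scheme, two equally sized studies that each report Z = 2 for the same allele combine to Z = 2√2, while equal and opposite Z-scores cancel to zero. This is why consistent allele coding across samples matters before any combination is attempted.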