Independent Methylome-Wide Association Studies Of
Total Page:16
File Type:pdf, Size:1020Kb
applyparastyle "fig//caption/p[1]" parastyle "FigCapt" applyparastyle "fig" parastyle "Figure" Schizophrenia Bulletin vol. 46 no. 2 pp. 319–327, 2020 doi:10.1093/schbul/sbz056 Advance Access publication 5 June 2019 Independent Methylome-Wide Association Studies of Schizophrenia Detect Consistent Case–Control Differences Downloaded from https://academic.oup.com/schizophreniabulletin/article-abstract/46/2/319/5511546 by University of Exeter user on 28 April 2020 Robin F. Chan1, Andrey A. Shabalin1, Carolina Montano2, Eilis Hannon3, Christina M. Hultman4, Margaret D. Fallin5, Andrew P. Feinberg6, Jonathan Mill3, Edwin J. C. G. van den Oord1, and Karolina A. Aberg*,1 1Center for Biomarker Research and Precision Medicine, School of Pharmacy, Virginia Commonwealth University, Richmond, VA; 2Department of Medicine, Boston Children’s Hospital, Boston, MA; 3University of Exeter Medical School, University of Exeter, Exeter, UK; 4Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; 5Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD; 6Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD *To whom correspondence should be addressed; tel: 804-628–3023, fax: +1 804 -628-3991, e-mail: [email protected] Methylome-wide association studies (MWASs) are prom- Introduction ising complements to sequence variation studies. We DNA methylation is a dynamic epigenetic modifica- used existing sequencing-based methylation data, which tion that provides the cellular phenotype both stability assayed the majority of all 28 million CpGs in the human and diversity. Investigations of these modifications pro- genome, to perform an MWAS for schizophrenia in blood, vide promising complements to schizophrenia studies while controlling for cell-type heterogeneity with a re- of sequence variation. In 2014, Aberg et al1 performed cently generated platform-specific reference panel. Next, the first large-scale methylome-wide association study we compared the MWAS results with findings from 3 ex- (MWAS) for schizophrenia, using methyl-binding do- isting large-scale array-based schizophrenia methylation main sequencing (MBD-seq), which assays nearly all ~28 studies in blood that assayed up to ~450 000 CpGs. Our million CpGs in the human genome.2,3 Since then, 2 in- MWAS identified 22 highly significant lociP ( < 5 × 10−8) dependent research teams, Hannon et al4 and Montano and 852 suggestively significant loci P( < 1 × 10−5). The et al5 have reported large-scale array-based methylation top finding P( = 5.62 × 10−11, q = 0.001) was located in studies for schizophrenia, which assay up to ~450 000 MFN2, which encodes mitofusin-2 that regulates Ca2+ CpGs that also accounted for cell-type proportions. transfer from the endoplasmic reticulum to mitochondria In the current investigation, we used a recently generated in cooperation with DISC1. The second-most significant MBD-seq specific leukocyte reference panel6 that allows us site (P = 1.38 × 10−9, q = 0.013) was located in ALDH1A2, to perform MWAS for schizophrenia with the MBD-seq which encodes an enzyme for astrocyte-derived retinoic data while accounting for cell-type heterogeneity.7 Next, acid—a key neuronal morphogen with relevance for schiz- we used a multi-marker approach to investigate the cu- ophrenia. Although the most significant MWAS findings mulative MWAS signal across the top methylation sites. were not assayed on the arrays, we observed significant Furthermore, we use a permutation-based test to examine enrichment of overlapping findings with 2 of the 3 array whether overlap between the new MBD-seq MWAS and the datasets (P = 0.0315, 0.0045, 0.1946). Overrepresentation array-based methylation studies demonstrated enrichment. analysis of Gene Ontology terms for the genes in the To the best of our knowledge, this is the first time that the significant overlaps suggested high similarity in the bio- results from multiple large-scale methylation datasets for logical functions detected by the different datasets. Top schizophrenia have been compared in this manner. terms were related to immune and/or stress responses, cell adhesion and motility, and a broad range of processes es- sential for neurodevelopment. Methods Two Approaches for Methylome-Wide Association Studies Key words: DNA methylation/MBD-seq/methylome- Whole genome bisulfite sequencing (WGB-seq), where wide association study (MWAS)/schizophrenia bisulfite converted genomic DNA is sequenced to assess © The Author(s) 2019. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: [email protected] 319 R. F. Chan et al the methylation status of each site across the genome, is with Finnish parents) and ancestry estimated from geno- often considered the “gold standard” in DNA methyla- type data did not substantially contribute to methylation tion research. However, WGB-seq remains too costly for variation, and thus were not included as covariates in the large-scale projects. Other approaches, such as MBD-seq MWAS. To avoid confounding due to cell-type hetero- and the microarray-based Infinium Human Methylation geneity, we also included covariates for blood cell-type 450K and MethylationEPIC assays, have been more fre- proportions estimated from the methylation data7 using Downloaded from https://academic.oup.com/schizophreniabulletin/article-abstract/46/2/319/5511546 by University of Exeter user on 28 April 2020 quently used for MWAS. Although both MBD-seq and reference methylomes.6 These reference methylomes were array approaches are suitable for MWAS they have dif- generated with MBD-seq after isolating all common cell ferent advantages and limitations. types in blood as previously described.6 Finally, we used In MBD-seq, genomic DNA is fragmented into principal component analysis (PCA) as implemented short ~150 bp fragments and a protein with high af- in RaMWAS to capture major unmeasured sources of finity for methylated CpGs is used to capture methylated variation remaining after accounting for the measured fragments, which is then sequenced.8 With this approach covariates. The PCA is comparable with surrogate variable the larger portion of the genome that lacks CpGs, and is analysis and also uses latent variables, but allows for the not methylated, is not sequenced. In practice, this allows >18 million CpG sites to be investigated simultaneously.6 for assaying the majority of all 28 million CpGs in the Using a scree test (supplementary figure S1) we selected the genome.3,9 However, it is often challenging to distinguish first principal component that explained 1.17% of the re- between neighboring CpGs in regions where multiple maining methylation variation for inclusion in the MWAS. CpGs are closely packed. The methylation arrays use bisulfite converted DNA Determining the Significance of the Cumulative MWAS like WGB-seq, but instead of sequencing the entire con- Signal by Resampling verted genome, assay selected sites by hybridization to a microarray. This allows for methylation at single CpGs to To study the significance of the combined MWAS sig- be distinguished. However, current arrays are limited in nals from associated methylation sites, we used the scope and only assay a small percentage (~3%–4%) CpGs. “ramwas7riskScoreCV” function in RaMWAS. This 13–15 In this study, we use four existing large-scale datasets function uses elastic-nets in combination with 10-fold 16 for schizophrenia that were generated with either MBD- cross-validation to obtain unbiased predictions of case– seq or array-based technology. control status for each individual. By testing whether these predictions are significantly correlated with actual schizophrenia status, we performed an “in sample rep- Existing Methylation Sequencing Data lication” of the cumulative MWAS signal. Additional Our primary study sample included existing methylome- details are provided in the supplementary materials. wide sequencing data from a Swedish schizophrenia 1 cohort. The methylomes of DNA extracted from Permutation-Based Enrichment Test of Overlap and blood were assayed using MBD-seq as previously Gene Ontology Analyses described.1,10 In the current study, sequencing reads were realigned (hg19/GRCh37) using an improved To further study the robustness and potential biological aligner Cushaw3.11 Data were then processed using the relevance of the observed associations we used the shiftR 17 RaMWAS Bioconductor package,12 which is an effi- R-package, described previously, with 2 thresholds (top cient analysis tool designed for large-scale methylation 1% and top 5% of results) to perform permutation-based studies. A description of the Swedish cohort and data enrichment tests of the overlap between our MWAS top processing is presented in the supplementary materials results and other enrichment datasets. To account for and supplementary table S1. After quality control, “multiple testing”, the same thresholds are used in the methylation data from 1448 subjects (744 schizophrenia permutations where the test statistic distribution under and 704 controls) for 18 793 496 CpGs was available for the null hypothesis is generated from the most significant statistical analysis. (combination of) thresholds. The enrichment datasets included 3 previous large-scale methylation studies for schizophrenia (described later) for which we used 1 mil- Methylome-Wide Association Study Controlling for lion permutations