Cell-Type-Specific Expression Quantitative Trait Loci Associated with Alzheimer Disease in Blood and Brain Tissue
Total Page:16
File Type:pdf, Size:1020Kb
Patel et al. Translational Psychiatry (2021) 11:250 https://doi.org/10.1038/s41398-021-01373-z Translational Psychiatry ARTICLE Open Access Cell-type-specific expression quantitative trait loci associated with Alzheimer disease in blood and brain tissue Devanshi Patel1,2, Xiaoling Zhang2,3,JohnJ.Farrell2, Jaeyoon Chung 2,ThorD.Stein4,5,6, Kathryn L. Lunetta 3 and Lindsay A. Farrer 1,2,3,7,8 Abstract Because regulation of gene expression is heritable and context-dependent, we investigated AD-related gene expression patterns in cell types in blood and brain. Cis-expression quantitative trait locus (eQTL) mapping was performed genome-wide in blood from 5257 Framingham Heart Study (FHS) participants and in brain donated by 475 Religious Orders Study/Memory & Aging Project (ROSMAP) participants. The association of gene expression with genotypes for all cis SNPs within 1 Mb of genes was evaluated using linear regression models for unrelated subjects and linear-mixed models for related subjects. Cell-type-specific eQTL (ct-eQTL) models included an interaction term for the expression of “proxy” genes that discriminate particular cell type. Ct-eQTL analysis identified 11,649 and 2533 additional significant gene-SNP eQTL pairs in brain and blood, respectively, that were not detected in generic eQTL analysis. Of note, 386 unique target eGenes of significant eQTLs shared between blood and brain were enriched in apoptosis and Wnt signaling pathways. Five of these shared genes are established AD loci. The potential importance and relevance to AD of significant results in myeloid cell types is supported by the observation that a large portion of fi 1234567890():,; 1234567890():,; 1234567890():,; 1234567890():,; GWS ct-eQTLs map within 1 Mb of established AD loci and 58% (23/40) of the most signi cant eGenes in these eQTLs have previously been implicated in AD. This study identified cell-type-specific expression patterns for established and potentially novel AD genes, found additional evidence for the role of myeloid cells in AD risk, and discovered potential novel blood and brain AD biomarkers that highlight the importance of cell-type-specific analysis. Introduction disease and AD risk alleles polarized for monocyte- Recent expression quantitative trait locus (eQTL) ana- specific eQTL effects7. In addition, disease and trait- lysis studies suggest that changes in gene expression have associated cis-eQTLs were more cell-type-specific than a role in the pathogenesis of AD1,2. However, regulation of average cis-eQTLs7. Another study classified 12% of more gene expression, as well as many biological functions, has than 23,000 eQTLs in blood as cell-type-specific4. Large been shown to be context-specific (e.g., tissue and cell eQTL studies across multiple human tissues have been types, developmental time point, sex, disease status, and conducted by the GTEx consortium, with a study on – response to treatment or stimulus)3 6. One study of 500 genetic effects on gene expression levels across 44 human healthy subjects found over-representation of T cell- tissues collected from the same donors characterizing specific eQTLs in susceptibility alleles for autoimmune patterns of tissue specificity recently published8. Microglia, monocytes, and macrophages share a similar developmental lineage and are all considered to be mye- Correspondence: Lindsay A. Farrer ([email protected]) loid cells9. It is believed that a large proportion of AD 1Bioinformatics Graduate Program, Boston University, Boston, MA, USA 2 genetic risk can be explained by genes expressed in Department of Medicine (Biomedical Genetics), Boston University School of 10 Medicine, Boston, MA, USA myeloid cells and not other cell types . Several Full list of author information is available at the end of the article © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a linktotheCreativeCommons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Patel et al. Translational Psychiatry (2021) 11:250 Page 2 of 17 established AD genes are highly expressed in micro- principal component of ancestry. ROSMAP gene expres- glia9,11, and a variant in the AD-associated gene CELF1 sion data were log-normalized and adjusted for known has been associated with lower expression of SPI1 in and hidden variables detected by surrogate variable ana- monocytes and macrophages10. AD risk alleles have been lysis (SVA)17 in order to determine which of these vari- shown to be enriched in myeloid-specific epigenomic ables should be included as covariates in analysis models annotations and in active enhancers of monocytes, mac- for eQTL discovery. Additional filtering steps of FHS and rophages, and microglia12, and to be polarized for cis- ROSMAP GWAS and gene expression data included eQTL effects in monocytes7. These findings suggest that a eliminating subjects with missing data, restricting gene cell-type-specific analysis in blood and brain tissue may expression data to protein-coding genes, and retaining identify novel and more precise AD associations that may common variants (MAF ≥ 0.05) with good imputation help elucidate regulatory networks. In this study, we quality (R2 ≥ 0.3). performed a genome-wide cis ct-eQTL analysis in blood and brain, respectively, then compared eQTLs and cell- Cis-eQTL mapping type-specific eQTLs (ct-eQTLs) between brain and blood Cis-eQTL mapping was performed using a genome- with a focus on genes, loci, and cell types previously wide design (Supplementary Fig. S1). The association of implicated in AD risk by genetic approaches. gene expression with SNP genotypes for all cis SNPs within 1 Mb of protein-coding genes was evaluated using Materials, subjects and methods linear-mixed models adjusting for family structure in FHS Study cohorts and linear regression models for unrelated individuals in Framingham Heart Study (FHS) ROSMAP. In FHS, lmekin function in the R kinship The FHS is a multigenerational study of health and disease package (version 1.1.3)18 was applied assuming an addi- in a prospectively followed community-based and primarily tive genetic model with covariates for age and sex, and non-Hispanic white sample. Procedures for assessing family structure modeled as a random-effects term for dementia and determining AD status in this cohort are kinship—a matrix of kinship coefficients calculated from described elsewhere13. Clinical, demographic, and pedigree pedigree structures. The linear model for analysis of FHS information, as well as 1000 Genomes Project Phase 1 can data be expressed as follows: imputed SNP genotypes and Affymetrix Human Exon 1.0 ST array gene expression data from whole blood, were obtained Yi ¼ I þ β1Gj þ β2Aij þ β3Sij þ Uij þ εij from dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/cgi- bin/study.cgi?study_id=phs000007.v31.p12). Requisite infor- where Yi is the expression value for gene i, Gj is the mation for this study was available for 5257 participants. genotype dosage for cis SNP j, Aij and Sij are the Characteristics of these subjects are provided in Supple- covariates for age and sex respectively, Uij is the random mentary Table S1. effect for family structure, and β1, β2, and β3 are regression coefficients. Religious Orders Study (ROS)/Memory and Aging Project ROSMAP data were analyzed using the lm function in (MAP) the base stats package in R (http://www.R-project.org/). ROS enrolled older nuns and priests from across the The regression model, which included covariates for age, US, without known dementia for longitudinal clinical sex, postmortem interval (PMI), study (ROS or MAP), and analysis and brain donation and MAP enrolled older a term for a surrogate variable (SV1) derived from analysis subjects without dementia from retirement homes who of high dimensional data, can be expressed as: agreed to brain donation at the time of death14 (http:// www.eurekaselect.com/99959/article). RNA-sequencing Yi ¼ I þ β1Gj þ β2Aij þ β3Sij þ β3Sij þ β4PMij þ β5ijS2 þ β6ijSV1 þ εij brain gene expression and whole-genome sequencing (WGS) genotype data were obtained from the AMP-AD where Yi is the expression value for gene i, Gj is the knowledge portal (https://www.synapse.org/#!Synapse: genotype dosage for cis SNP j, Aij, Sij, PMij, S2ij, and SV1ij syn3219045)(https://www.synapse.org/). are the covariates for age, sex, PMI, study, and SV1, respectively, ɛij is the residual error, and the βs are Data processing regression coefficients. Generation, initial quality control (QC), and pre- processing procedures of the FHS GWAS and expres- Cis ct-eQTL mapping sion data are described elsewhere13. Briefly, the Robust Models testing associations with cell-type-specific Multichip Average (RMA) method15,16 was used for eQTLs (ct-eQTLs) included an interaction term for background adjustment and normalization of gene expression levels of “proxy” genes that represent cell expression levels and further adjusted for the first types. Proxy genes representing ten cell types in whole Patel et al. Translational Psychiatry (2021) 11:250 Page 3 of 17 – blood4 and five cell types in brain19 21 were incorporated which include genome-wide significant “peak” SNPs (i.e., in cell-type-specific models (Supplementary Table S2). top-ranked SNP within an association signal) for AD.