Multiomics Dissection of Molecular Regulatory Mechanisms Underlying Autoimmune-Associated Noncoding Snps
Total Page:16
File Type:pdf, Size:1020Kb
Multiomics dissection of molecular regulatory mechanisms underlying autoimmune-associated noncoding SNPs Xiao-Feng Chen, … , Yan Guo, Tie-Lin Yang JCI Insight. 2020;5(17):e136477. https://doi.org/10.1172/jci.insight.136477. Research Article Genetics More than 90% of autoimmune-associated variants are located in noncoding regions, leading to challenges in deciphering the underlying causal roles of functional variants and genes and biological mechanisms. Therefore, to reduce the gap between traditional genetic findings and mechanistic understanding of disease etiologies and clinical drug development, it is important to translate systematically the regulatory mechanisms underlying noncoding variants. Here, we prioritized functional noncoding SNPs with regulatory gene targets associated with 19 autoimmune diseases by incorporating hundreds of immune cell–specific multiomics data. The prioritized SNPs are associated with transcription factor (TF) binding, histone modification, or chromatin accessibility, indicating their allele-specific regulatory roles. Their target genes are significantly enriched in immunologically related pathways and other known immunologically related functions. We found that 90.1% of target genes are regulated by distal SNPs involving several TFs (e.g., the DNA-binding protein CCCTC-binding factor [CTCF]), suggesting the importance of long-range chromatin interaction in autoimmune diseases. Moreover, we predicted potential drug targets for autoimmune diseases, including 2 genes (NFKB1 and SH2B3) with known drug indications on other diseases, highlighting their potential drug repurposing opportunities. Taken together, these findings may provide useful information for future experimental follow-up and drug applications on autoimmune diseases. Find the latest version: https://jci.me/136477/pdf RESEARCH ARTICLE Multiomics dissection of molecular regulatory mechanisms underlying autoimmune-associated noncoding SNPs Xiao-Feng Chen, Ming-Rui Guo, Yuan-Yuan Duan, Feng Jiang, Hao Wu, Shan-Shan Dong, Xiao-Rong Zhou, Hlaing Nwe Thynn, Cong-Cong Liu, Lin Zhang, Yan Guo, and Tie-Lin Yang Key Laboratory of Biomedical Information Engineering of Ministry of Education and Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China. More than 90% of autoimmune-associated variants are located in noncoding regions, leading to challenges in deciphering the underlying causal roles of functional variants and genes and biological mechanisms. Therefore, to reduce the gap between traditional genetic findings and mechanistic understanding of disease etiologies and clinical drug development, it is important to translate systematically the regulatory mechanisms underlying noncoding variants. Here, we prioritized functional noncoding SNPs with regulatory gene targets associated with 19 autoimmune diseases by incorporating hundreds of immune cell–specific multiomics data. The prioritized SNPs are associated with transcription factor (TF) binding, histone modification, or chromatin accessibility, indicating their allele-specific regulatory roles. Their target genes are significantly enriched in immunologically related pathways and other known immunologically related functions. We found that 90.1% of target genes are regulated by distal SNPs involving several TFs (e.g., the DNA-binding protein CCCTC-binding factor [CTCF]), suggesting the importance of long-range chromatin interaction in autoimmune diseases. Moreover, we predicted potential drug targets for autoimmune diseases, including 2 genes (NFKB1 and SH2B3) with known drug indications on other diseases, highlighting their potential drug repurposing opportunities. Taken together, these findings may provide useful information for future experimental follow-up and drug applications on autoimmune diseases. Introduction Autoimmune diseases are groups of complex immune system disorders with high prevalence rates world- wide (4.5%) (1). High heritabilities have been previously observed on various autoimmune diseases (~60%– 90%) (2). In addition, GWAS have unraveled hundreds of susceptible loci associated with autoimmune diseases (3), suggesting many functional genes involved in some key immunological pathways (e.g., MHC gene clusters in antigen presentation, TYK2 in cytokine signals) (4). However, the true functional variants and target genes for most GWAS variants remain largely unknown (4), and their discovery may be limited Conflict of interest: The authors have by 2 challenges. First, the detected variants may be in linkage disequilibrium (LD) with causal functional declared that no conflict of interest SNPs without genotyping. Second, more than 90% of GWAS variants are located in the uncultivated non- exists. coding regions, complicating their functional interpretation. Recent studies have integrated functional epigenetic data to predict noncoding SNP function. Many of Copyright: © 2020, Chen et al. This is an open access article published under these methods, such as CADD (5), DeepSEA (6), GWAVA (7), FATHMM-MKL (8), ReMM (9), and FIRE the terms of the Creative Commons (10), adopted machine-learning algorithms to develop classifiers through integrating various annotations Attribution 4.0 International License. and labeled training data to distinguish potential functional and nonfunctional SNPs. However, the prior Submitted: January 17, 2020 labeled training data may be both inaccurate and impractical owing to the current knowledge limitation Accepted: July 16, 2020 in functional roles underlying noncoding SNPs. Other methods, such as RegulomeDB (11), 3DSNP (12), Published: September 3, 2020. GWAS4D (13), IW-Scoring (14), Eigen (15), and FunSeq2 (16), either directly combined various epigenetic regulatory features to rank SNP functionality or adopted a weighted scoring scheme by considering the Reference information: JCI Insight. 2020;5(17):e136477. relative importance of each feature to assign SNP functionality scores. However, these approaches incor- https://doi.org/10.1172/jci. porated epigenetic or transcriptional annotation across all cells or tissues, omitting the cell- or context-spe- insight.136477. cific regulation, aiming to prioritize potential functional variants rather than to dissect the downstream insight.jci.org https://doi.org/10.1172/jci.insight.136477 1 RESEARCH ARTICLE regulatory circuits linking functional variants to disease etiology. Autoimmune disease–associated variants are significantly enriched in blood cell-specific enhancers (17), implying that the integration of cell-specific functional data are required for dissecting molecular regulatory mechanisms underlying noncoding vari- ants associated with autoimmune diseases. The incorporation of cell-specific multiomics data has accelerated the decryption of functional mecha- nisms underlying noncoding GWAS variants (18, 19). Recently, we identified a functional SNP, associated with 2 autoimmune diseases, that exerted allele-specific enhancer regulation onIRF5 expression through long-rang loop formation (20). Nevertheless, these studies primarily focused on one GWAS susceptibility loci on one disease, and only a limited number of functional causal variants predisposing to autoimmune diseases have been validated (20). The autoimmune diseases share substantial common susceptibility vari- ants and immunopathology (21). Therefore, deciphering the functions of GWAS noncoding variants sys- tematically is essential for accelerating GWAS findings into useful biological and clinical insight into the causes of autoimmune diseases. To address these issues, we devised an integrative analysis frame to prioritize potential functional non- coding SNPs on 19 autoimmune diseases and further predicted their local and distal regulatory target genes using epigenetic, transcriptional, and 3D chromatin interaction data across hundreds of blood immune cell types (Supplemental Tables 2, 3, and 6; supplemental material available online with this article; https:// doi.org/10.1172/jci.insight.136477DS1). Our analysis contains an integrative functional SNP prioritiza- tion method combing cell-specific epigenetic scoring and allele-specific regulatory analysis. We next evalu- ated the performance of our method by comparing it with other representative methods. We then explored the immunologically related function as well as potential clinical drug applications for predicted target genes. In addition, we analyzed the roles of long-range chromatin interactions on autoimmune SNPs as well as potential key regulatory transcription factors (TFs). Finally, we developed an open web resource (http://fnGWAS.online/) and a local analytical pipeline (https://github.com/xjtugenetics/fnGWAS). Results Integrative analysis prioritized potential noncoding functional autoimmune SNPs with causal target genes. We col- lected 18,857 autoimmune noncoding tag SNPs predisposed to 19 distinct autoimmune diseases (report- ed P < 5×10–8) from multiple resources (3, 22, 23) (Supplemental Table 1). LD analysis retained 51,594 noncoding tags and LD-expanded (r2 > 0.8 in European ancestry) SNPs in 333 genome-wide significant loci (autoimmune-positive SNPs). We next collected 26,922,878 background SNPs in all 333 loci, and collected 47,131,427 negative SNPs beyond these loci (Supplemental Figure 1, further details in Meth- ods). To explore potential key epigenetic regulatory features for autoimmune diseases, we collected 606 epigenetic annotation data across 47 blood immune cell types from Roadmap (24) and