Annotation of Functional Variation Within Non-MHC MS Susceptibility Loci Through Bioinformatics Analysis

Genes and Immunity (2014) 15, 466–476
© 2014 Macmillan Publishers Limited All rights reserved 1466-4879/14
www.nature.com/gene

ORIGINAL ARTICLE

FBS Briggs, LJ Leung and LF Barcellos

There is a strong and complex genetic component to multiple sclerosis (MS). In addition to variation in the major histocompatibility complex (MHC) region on chromosome 6p21.3, 110 non-MHC susceptibility variants have been identified in Northern Europeans, thus far. The majority of the MS-associated genes are immune related; however, similar to most other complex genetic diseases, the causal variants and biological processes underlying pathogenesis remain largely unknown. We created a comprehensive catalog of putative functional variants that reside within linkage disequilibrium regions of the MS-associated genic variants to guide future studies. Bioinformatics analyses were also conducted using publicly available resources to identify plausible pathological processes relevant to MS and functional hypotheses for established MS-associated variants.

Genes and Immunity (2014) 15, 466–476; doi:10.1038/gene.2014.37; published online 17 July 2014

Genetic Epidemiology and Genomics Laboratory, Division of Epidemiology, School of Public Health, University of California, Berkeley, CA, USA. Correspondence: Professor LF Barcellos, Genetic Epidemiology and Genomics Laboratory, Division of Epidemiology, School of Public Health, 324 Stanley Hall, University of California, Berkeley, CA 94720, USA. E-mail: [email protected]
Received 20 February 2014; revised 2 May 2014; accepted 29 May 2014; published online 17 July 2014

INTRODUCTION

Multiple sclerosis (MS) is a clinically heterogeneous autoimmune disease of the central nervous system with a complex etiology, primarily characterized by demyelination and the formation of neurological lesions.1 The prevalence of MS is greatest among Northern Europeans (0.1–0.2%).1 There is a prominent genetic component illustrated by consistently higher disease concordance in monozygotic twins compared with dizygotic twins across several populations (30% and 5%, respectively), similar to other autoimmune diseases.2 The relative recurrence risk (λs) is ~6.3 among siblings.3 For several decades, variation within the major histocompatibility complex (MHC) on chromosome 6p21.3 has been known to confer the strongest genetic risk for MS. The primary susceptibility locus is a human leukocyte antigen (HLA) class II allele, HLA-DRB1*15:01.1,4,5 Recent fine-mapping of the MHC demonstrated the presence of at least 10 additional independent HLA and non-HLA risk alleles for MS.6

Full characterization of the non-MHC genetic component in MS has been challenging; however, 110 non-MHC risk variants have been identified within the last decade through both genome-wide association studies (GWASs) and follow-up candidate gene studies facilitated by international research initiatives (International Multiple Sclerosis Genetics Consortium and collaborative construction of the ImmunoChip).4,5 These variants explain 20% of λs, and with the inclusion of the four most prominent HLA risk alleles as much as 28% of λs.5 These risk variants are both genic (63 single-nucleotide polymorphisms (SNPs) in 63 genes) and intergenic, similar to most complex diseases. For most of the 63 MS-associated genes, their contribution to pathogenesis is unclear; however, they appear to be primarily involved in immune response. In the largest MS association study, high-resolution mapping was performed for 8 risk loci; for 5 of them, TNFSF14, IL2RA, TNFRSF1A, IL12A and STAT4, 50% of the posterior probability of association was explained by an intronic variant; 3 of the 5 variants are associated with protein levels.5 Functional experiments have revealed the primary causal variants affecting protein structure through alternative splicing within IL7R7 and TNFRSF1A.8 However, the causal variants and the pathological biological processes mediated by the remaining 103 loci have not been fully explored.

Here, we present a comprehensive catalog of candidate functional variants within linkage disequilibrium (LD) blocks encompassing the non-MHC MS susceptibility loci to expedite the prioritization of SNPs for future fine-mapping analyses. We also conducted several bioinformatics enrichment analyses that integrated multiple functional (-omic) data for both MS-associated genes and the individual non-MHC risk variants. These results provide thorough insight into underlying disease mechanisms in MS based on current knowledge.

RESULTS

Of the 110 non-MHC MS risk variants, 63 SNPs are within genes, 1 SNP is within a noncoding RNA (PVT1) and the remaining 46 SNPs are intergenic (Table 1). Using Haploreg v2 (2013.02.14; http://www.broadinstitute.org/mammals/haploreg/haploreg.php),9 features of the genic risk SNPs were assessed. Among the genic risk SNPs, there were three missense SNPs (TYK2, SLC44A2 and IFI30), two 3′ untranslated region (UTR) SNPs (EVI5 and CD69), one 5′ UTR SNP (CD86), one synonymous SNP (TIMMDC1) and one downstream SNP (EOMES); the remaining 45 SNPs are intronic (Table 1). With the exception of the TYK2 missense variant, these variants are fairly common with minor allele frequencies >5%. A total of 92 SNPs reside within regulatory motifs, and 27 SNPs reside within transcription factor binding sites. Interestingly, the MS risk variant rs4796791 is an intronic STAT3 (a transcription factor) SNP, and three other MS risk variants exist within signal transducer and activator of transcription 3 (STAT3) binding sites within their respective genes/noncoding RNA: rs4410871 (PVT1), rs917116 (JAZF1) and rs2236262 (ZFP36L1). Among all variants, there was a significant enrichment of strong enhancers for several cell lines, including human embryonic stem cells and leukemia

Table 1. Annotation for the 110 non-MHC MS risk SNPs

Chr | Base pair location | SNP ID | Ref allele | Alt allele | EUR freq | Published odds ratio (95% CI)a | Proteins bound (transcription factor binding site) / motifs changed | Gene | Gene location
1 | 2525665 | rs3748817 | T | C | 0.33 | 1.14 (1.10–1.18) | ZNF263, GR, p300 | MMEL1 | Intronic
1 | 6530189 | rs3007421 | G | A | 0.11 | 1.12 (1.07–1.18) | ERa-a, Nkx2, RAR | PLEKHG5 | Intronic
1 | 85746993 | rs12087340 | C | T | 0.08 | 1.22 (1.15–1.29) |  | 4.4 kb 5′ of BCL10 |
1 | 85915183 | rs11587876 | T | C | 0.21 | 1.12 (1.07–1.17) | P300 | DDAH1 | Intronic
1 | 92975464 | rs41286801 | C | T | 0.17 | 1.20 (1.15–1.25) | Arid5a, CEBPA, CEBPB, CTCF, Pou2f2, STAT | EVI5 | 3′-UTR
1 | 101240893 | rs7552544 | T | C | 0.44 | 1.08 (1.05–1.12) | Pax-4, Zbtb3 | 36 kb 3′ of VCAM1 |
1 | 101407519 | rs11581062 | A | G | 0.29 | 1.05 (1.01–1.09) | SEF-1, YY1 | SLC30A7 | Intronic
1 | 117080166 | rs6677309 | A | C | 0.14 | 1.34 (1.27–1.41) | POL24H8, AP2ALPHA, Mtf1, Rad21, AP2GAMMA, E2F6, CMYC | CD58 | Intronic
1 | 120258970 | rs666930 | T | C | 0.54 | 1.09 (1.06–1.13) | EBF, Pou1f1 | PHGDH | Intronic
1 | 157770241 | rs2050568 | C | T | 0.47 | 1.08 (1.05–1.12) |  | FCRL1 | Intronic
1 | 160711804 | rs35967351 | A | T | 0.3 | 1.09 (1.05–1.13) | GATA2, ATF3, CCNT2, INSM1, SP1, ZBTB33 | SLAMF7 | Intronic
1 | 192541472 | rs1359062 | C | G | 0.81 | 1.18 (1.13–1.23) |  | 3.4 kb 5′ of RGS1 |
1 | 200874728 | rs55838263 | A | G | 0.26 | 1.12 (1.08–1.17) | FOXA1, HNF4G, RXRA, Ik-2, NF-AT1, Pou2f2, TCF4 | C1orf106 | Intronic
2 | 25017860 | rs4665719 | C | T | 0.75 | 1.09 (1.05–1.13) |  | CENPO | Intronic
2 | 43361256 | rs2163226 | T | C | 0.27 | 1.10 (1.07–1.15) | CMYC, Irf, Nanog, Spz1 | 88 kb 3′ of ZFP36L2 |
2 | 61095245 | rs842639 | G | A | 0.68 | 1.11 (1.08–1.15) | PU1, YY1, ELF1 | FLJ16341 | Intronic
2 | 68587477 | rs7595717 | C | T | 0.28 | 1.10 (1.06–1.14) |  | 4.8 kb 5′ of PLEK |
2 | 112665201 | rs17174870 | C | T | 0.29 | 1.03 (1.00–1.07) | Maf | MERTK | Intronic
2 | 191974435 | rs9967792 | T | C | 0.66 | 1.11 (1.07–1.15) | TCF12 | STAT4 | Intronic
2 | 231115454 | rs9989735 | G | C | 0.18 | 1.17 (1.12–1.22) | POL24H8, AP-1, CAC-binding-protein, CCNT2, E2F, EWSR1-FLI1, Egr-1, MAZ, MAZR, MZF1::1-4, Myc, PU.1, Pou2f2, SP1, STAT, Sp4, TATA, TFII-I, WT1, ZNF263, Zfp281 | SP140 | Intronic
3 | 18785585 | rs11719975 | G | C | 0.25 | 1.09 (1.05–1.13) |  | 305 kb 5′ of SATB1 |
3 | 27757018 | rs2371108 | G | T | 0.4 | 1.08 (1.05–1.12) | Zic | 866 bp 3′ of EOMES | Downstream
3 | 28078571 | rs1813375 | G | T | 0.49 | 1.15 (1.12–1.19) | EBF1, CCNT2, Cdx, Foxc1, Foxj1, GATA, Nkx3, Sox, TAL1 | 205 kb 5′ of CMC1 |
3 | 33013483 | rs4679081 | T | C | 0.54 | 1.08 (1.04–1.11) | CEBPG, DMRT7, Evi-1, Foxo, HNF1, Hoxa9, Hoxc10, Hoxc9, Irf, Mef2 | 17 kb 3′ of CCR4 |
3 | 71530346 | rs9828629 | C | T | 0.41 | 1.08 (1.05–1.12) | CEBPB, HMG-IY, Myc, Zfp187 | FOXP1 | Intronic
3 | 105558837 | rs2028597 | G | A | 0.08 | 1.04 (0.98–1.11) | AIRE, AP-4, LBP-1 | CBLB | Intronic
3 | 119222456 | rs1131265 | G | C | 0.16 | 1.19 (1.14–1.24) |  | TIMMDC1 | Synonymous
3 | 121543577 | rs1920296 | A | C | 0.61 | 1.14 (1.11–1.18) | FAC1, GR | IQCB1 | Intronic
3 | 121770539 | rs2255214 | G | T | 0.51 | 1.11 (1.08–1.15) | BRCA1, Foxi1, Pou2f2, Pou3f3, TEF | 3.7 kb 5′ of CD86 |
3 | 121796768 | rs9282641 | G | A | 0.09 | 1.12 (1.05–1.19) | NF-kB, POL2, POL24H8, PU1 | CD86 | 5′-UTR
3 | 159691112 | rs1014486 | T | C | 0.47 | 1.11 (1.07–1.14) | T3R | 16 kb 5′ of IL12A |
4 | 103551603 | rs7665090 | A | G | 0.49 | 1.08 (1.05–1.12) | Pou1f1 | 1 kb 3′ of MANBA |
4 | 106173199 | rs2726518 | A | C | 0.6 | 1.09 (1.05–1.13) | PPAR, RAR | TET2 | Intronic
5 | 35879156 | rs6881706 | G | T | 0.28 | 1.12 (1.08–1.16) | Barhl1, Barx1, Barx2, En-1, Gbx1, Gbx2, Hlxb9, Ik-1, Ik-2, Isl2, Lhx4, Lhx8, Msx-1, Msx2, Pax-6, Pax7, Phox2a, Pou3f2, Prrx1, Prrx2 | 2.2 kb 3′ of IL7R |
5 | 40399096 | rs6880778 | A | G | 0.58 | 1.10 (1.06–1.14) | AP-1, Elf5, HMG-IY, Maf, STAT | 281 kb 5′ of PTGER4 |
5 | 55440730 | rs71624119 | G | A | 0.25 | 1.12 (1.08–1.17) | Bbx, DMRT1, STAT | ANKRD55 | Intronic
5 | 133446575 | rs756699 | C | T | 0.85 | 1.12 (1.07–1.18) | PLZF, Pax-5 | 3.8 kb 5′ of TCF7 |
5 | 141506564 | NA | T | G | 0.61 | 1.07 (1.04–1.11) |  | NDFIP1 | Intronic
5 | 158759900 | rs2546890 | A | G | 0.5 | 1.06 (1.02–1.09) | EBF1, MEF2A, PU1, Ik-2, RBP-Jk | LOC285626 | Intronic
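The summary values in Table 1 can be put on scales that are easier to combine across SNPs. The following is a minimal Python sketch, not the authors' pipeline. It assumes that the "EUR freq" column gives the alternate-allele frequency in Europeans and that the published 95% confidence intervals are symmetric on the log-odds scale. The function names and the "Intergenic" label are hypothetical, introduced only for illustration; the numeric values are copied from five rows of Table 1.

```python
import math
from collections import Counter

# Five rows copied from Table 1:
# (SNP ID, EUR freq, odds ratio, CI lower, CI upper, gene location).
# "Intergenic" is an illustrative label for a row whose gene-location
# cell is empty in the table (rs1359062, 3.4 kb 5' of RGS1).
ROWS = [
    ("rs3748817", 0.33, 1.14, 1.10, 1.18, "Intronic"),
    ("rs41286801", 0.17, 1.20, 1.15, 1.25, "3'-UTR"),
    ("rs6677309", 0.14, 1.34, 1.27, 1.41, "Intronic"),
    ("rs1359062", 0.81, 1.18, 1.13, 1.23, "Intergenic"),
    ("rs1131265", 0.16, 1.19, 1.14, 1.24, "Synonymous"),
]


def minor_allele_frequency(alt_freq):
    """Return the frequency of the rarer allele.

    Assumes alt_freq is the alternate-allele frequency, so the
    reference-allele frequency is 1 - alt_freq.
    """
    return min(alt_freq, 1.0 - alt_freq)


def log_odds_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Convert an odds ratio and its 95% CI to a log-odds estimate and SE.

    Assumes the interval was built as exp(beta +/- z * SE), so the
    width of the interval on the log scale equals 2 * z * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def summarize(rows):
    """Build one summary record per SNP row."""
    records = []
    for snp, freq, odds_ratio, lo, hi, location in rows:
        beta, se = log_odds_and_se(odds_ratio, lo, hi)
        records.append({
            "snp": snp,
            "maf": minor_allele_frequency(freq),
            "beta": beta,
            "se": se,
            "location": location,
        })
    return records


# Tally how many of the example rows fall in each annotation class.
location_counts = Counter(row[5] for row in ROWS)

if __name__ == "__main__":
    for record in summarize(ROWS):
        print(record)
    print(dict(location_counts))
```

Under these assumptions, a SNP such as rs1359062 with an alternate-allele frequency of 0.81 has a minor allele frequency of about 0.19, which is how the ">5%" statement in the text would be checked row by row.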
