Genetics of RA Susceptibility, What Comes Next?
Rheumatoid arthritis
REVIEW

James Ding, Stephen Eyre, Jane Worthington

RMD Open: first published as 10.1136/rmdopen-2014-000028 on 15 April 2015.

To cite: Ding J, Eyre S, Worthington J. Genetics of RA susceptibility, what comes next? RMD Open 2015;1:e000028. doi:10.1136/rmdopen-2014-000028

▸ Prepublication history for this paper is available online. To view these files please visit the journal online (http://dx.doi.org/10.1136/rmdopen-2014-000028).

Received 10 February 2015
Revised 25 March 2015
Accepted 28 March 2015

Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, Institute of Inflammation and Repair, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK

Correspondence to Dr Jane Worthington; jane.worthington@manchester.ac.uk

ABSTRACT
Summary: Genome-wide association studies (GWASs) have been used to great effect to identify genetic susceptibility loci for complex disease. A series of GWAS and meta-analyses have informed the discovery of over 100 loci for rheumatoid arthritis (RA). In common with findings in other autoimmune diseases, the lead signals for the majority of these loci do not map to known gene sequences. In order to realise the benefit of investment in GWAS it is vital that we determine how disease-associated alleles function to influence disease processes. This is leading to rapid development in our knowledge of the function of non-coding regions of the genome. Here we consider possible functional mechanisms for intergenic RA-associated variants which lie within lncRNA sequences.

Rheumatoid arthritis (RA) is a polygenic autoimmune disease that is characterised by chronic inflammation of the synovial joints, and associated with the presence of anticitrullinated protein antibodies (ACPA). For over 40 years it has been well established that there is a strong association between susceptibility to RA and human leucocyte antigen (HLA) DRB1 alleles that share a consensus sequence, known as the ‘shared epitope’.1 Spanning positions 70–74 of the β subunit of the HLA DR molecule, this association accounts for approximately 40% of the genetic component of susceptibility to RA, and has recently been refined following imputation of amino acid sequences from high-density single nucleotide polymorphism (SNP) data.2 It is only since the introduction of SNP-based genome-wide association studies (GWAS) in 2007 that an extensive list of additional genetic risk loci has been associated with RA.3 Currently, over 100 risk loci have been defined, explaining approximately half of the genetic variance in RA.4 5

The challenge in this post-GWAS era is to interpret and translate these genetic findings, defining how they influence disease susceptibility and outcome. Signals from GWAS typically comprise large numbers of highly correlated SNPs defining regions that contain, on average, four genes. The remaining challenge is therefore to identify, for each locus, the functional variant(s), the genes on which they act and the specific cell type in which dysregulation contributes to disease susceptibility. In addition to providing insights into RA pathogenesis, understanding genetic susceptibility at this level has the potential to impact on both drug discovery and the optimal implementation of established therapies through a greater understanding of disease heterogeneity. Thus far, functional studies to further characterise implicated genetic regions in disease have focused almost exclusively on associated variants in protein-coding genes; however, these account for only a small minority of associated loci. Evidence suggests that for RA and many other complex diseases the majority of associated variants map outside protein-coding regions.

INTERGENIC RA LOCI
In contrast to monogenic traits, an enrichment analysis performed using data from over 100 GWAS publications has shown that SNPs associated with complex traits are predominantly intergenic and do not map to protein-coding sequences.6 Traditionally, the functional implications of intergenic RA loci have been attributed to the nearest protein-coding gene known to have immunological relevance. This assumption, that genetic polymorphisms alter the expression of neighbouring genes by affecting cis-regulatory elements, such as enhancers and promoters, has been demonstrated to be misguided at a number of loci.

For example, an association between the chromosome 16p13 region and a number of autoimmune diseases has been shown. This association was originally attributed to CLEC16A, with the majority of disease-associated SNPs found within the transcribed region of this gene, especially within intron 19. CLEC16A is expressed in a variety of immune-relevant cells and contains an immunoreceptor tyrosine-based activation motif. Functional studies have, however, revealed a long-range chromatin interaction between intron 19 of CLEC16A and regulatory sequences that affect the expression of DEXI, found ∼140 kb from the SNPs.7 The focus of research at this locus has now moved to DEXI as an autoimmune candidate gene. A similar example, where the SNPs in question are intergenic rather than intronic, can be found at the chromosome 1p13 region, which is associated with low-density lipoprotein cholesterol and myocardial infarction. Here a disease-associated SNP disrupts a transcription factor binding site found between CELSR2 and PSRC1, which inhibits the expression of SORT1, found ∼35 kb from the putative causal SNP.8

Evidence is now emerging that the variants most strongly associated with autoimmune disease (lead SNPs) are enriched in areas of the chromosome that: are open and active, are defined by epigenetic marks, regulate gene transcription, and act in a cell-specific manner. Indeed, a recent study of 21 autoimmune diseases by Farh et al9 described an enrichment of putative causal SNPs in immune-cell enhancers. The majority (∼60%) of these SNPs, associated with RA and 20 other autoimmune diseases, map to enhancer elements that are active in lymphoblasts, lymphocytes and macrophages. The RA-associated enhancers are enriched for: cell specificity (T and B cells), activity in response to stimulation, and size (large ‘super-enhancers’). Furthermore, they are associated with the expression of non-coding RNAs (enhancer-derived RNAs, eRNAs, also known as elncRNAs). This was shown to be true within the subset of RA-associated super-enhancers in the analysis by Farh et al, but is best illustrated by Hnisz et al10 in one of the first papers to describe super-enhancers. Super-enhancers consist of clusters of enhancer elements that play a key role in defining cell identity and are associated with a 24.3-fold enrichment of RNA, compared to typical enhancers. Farh et al found a 3.2-fold enrichment of RA SNPs in super-enhancers compared to typical enhancers and speculated that these SNPs could elicit subtle and stimulation-specific changes in gene expression. Defining the function of these ncRNA-expressing regions will be pivotal if we are to translate GWAS findings into mechanisms of susceptibility.

By recruiting various proteins, enhancers mediate chromosomal interactions, enabling the assembly of transcriptional machinery at promoter regions. It is well established that these regulatory elements do not necessarily affect the transcription of neighbouring genes, can function over large distances and generally affect the expression of more than one gene.11 One of the hallmarks […] enhancers characterised by histone 3 lysine 27 acetylation (H3K27ac).12

Lead disease-associated variants, situated as they are in gene regulatory regions, may have a number of functional consequences that contribute to increased risk of RA. These include disruption of transcription factor binding sites and chromatin interactions, as at the SORT1 and DEXI loci, as well as other mechanisms of disrupting regulatory features such as enhancers, silencers, promoters and insulators. Examples include potential impacts on the deposition of epigenetic markers and RNA binding. It is also possible for non-coding SNPs to affect the function of non-coding RNA transcripts, such as those of eRNA or lncRNA, by altering their sequence or expression pattern.

LNCRNA
lncRNA are traditionally defined by an arbitrary lower limit of 200 bases, with over 100 000 transcripts identified in humans.13 Their diverse functions and characteristics have only recently attracted the attention of the scientific community (see box 1 for a description of various lncRNA categories). Further classification of lncRNA has focused on genomic context, with over a third of transcripts found in intergenic regions (long intervening non-coding RNA, lincRNA). While advances in RNA sequencing (RNA Seq) techniques can account for many of the lncRNA species identified to date, a large number of lincRNA have been identified by having a similar chromatin signature to that of other actively transcribed genes. Typically, trimethylation of histone 3 lysine 4 (H3K4me3) is observed at the promoter, with histone 3 lysine 36 trimethylation (H3K36me3) found along the transcribed length.

lincRNAs are typically transcribed by RNA Polymerase II and often undergo polyadenylation and splicing, just like protein-coding transcripts.14 This is, however, not the case for some eRNAs, which are subdivided according to whether they are unidirectionally or bidirectionally transcribed (1D-eRNA, 2D-eRNA). eRNA are

Box 1 Summary of lncRNA features and classification.
Long non-coding RNA (lncRNA)
▸ Defined as over 200 nt in length. Generally transcribed by RNA Polymerase II, spliced and polyadenylated, as for protein-coding genes.
▸ lncRNA are often characterised according to genomic position, for example, exonic, intergenic (lincRNA), etc.
▸ Other, functional categories have been proposed.
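
As an illustration only, the length and position criteria summarised in box 1 can be expressed as a simple decision rule. The sketch below is not taken from the article or from any annotation pipeline: the `Transcript` fields, the `classify` function, the label strings and the simplified 'genic' category (any overlap with a protein-coding gene) are all assumptions introduced here, and real annotation resources apply finer-grained categories.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Threshold from box 1: lncRNA are "over 200 nt in length".
LNCRNA_MIN_LENGTH_NT = 200


@dataclass
class Transcript:
    """Hypothetical, minimal transcript record used only for this sketch."""
    name: str        # illustrative identifier, not a real annotation
    length_nt: int   # length of the mature transcript in nucleotides
    start: int       # genomic start on one chromosome (0-based, inclusive)
    end: int         # genomic end on the same chromosome (exclusive)
    coding: bool     # True if the transcript encodes a protein


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify(t: Transcript, coding_genes: List[Tuple[int, int]]) -> str:
    """Assign a coarse label from coding status, length and genomic position.

    `coding_genes` lists (start, end) intervals of protein-coding genes on
    the same chromosome as the transcript (an assumption of this sketch).
    """
    if t.coding:
        return "protein-coding"
    if t.length_nt <= LNCRNA_MIN_LENGTH_NT:
        return "short non-coding"
    if any(overlaps(t.start, t.end, g0, g1) for g0, g1 in coding_genes):
        # Simplification: exonic, intronic, etc. are collapsed into one label.
        return "genic lncRNA"
    return "lincRNA"


if __name__ == "__main__":
    genes = [(1000, 5000)]  # one made-up protein-coding gene
    example = Transcript("example_1", 350, 6000, 7000, coding=False)
    print(example.name, classify(example, genes))
```

The rule checks coding status first, then length, then position, so that a short or protein-coding transcript is never labelled lincRNA merely because it falls between genes.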