CHAPTER 12 GENETICS OF TYPE 1 DIABETES Stephen S. Rich, PhD, Henry Erlich, PhD, and Patrick Concannon, PhD
Dr. Stephen S. Rich is Director of the Center for Public Health Genomics, University of Virginia, Charlottesville, VA. Dr. Henry Erlich is Senior Scientist at the Children’s Hospital Oakland Research Institute, Oakland, CA, and past Director of Human Genetics and Vice-President of Discovery Research, Roche Molecular Systems, Pleasanton, CA. Dr. Patrick Concannon is Director of the Genetics Institute, University of Florida, Gainesville, FL.
SUMMARY
Type 1 diabetes is a complex disease that has both genetic and environmental determinants. Based on twin and family studies, the estimated contribution of genetic factors to type 1 diabetes risk is ~50%. Genes and their variants within the human major histocompatibility complex (MHC), including the human leukocyte antigen (HLA) class I (HLA-A, -B, and -C) and class II (HLA-DR, -DQ, and -DP) loci, account for about one-half of the genetic risk of type 1 diabetes. Three amino acid positions in HLA-DQ and HLA-DR define ~90% of the variation in the MHC, with evidence of interactions between pairs of HLA haplotypes that affect antigen binding. Other major contributors to type 1 diabetes genetic risk have been identified through candidate gene and linkage studies and include variants in or near the INS, CTLA4, IL2RA, and PTPN22 genes. Genome-wide association approaches have revealed additional loci containing common variants with relatively small individual effects on type 1 diabetes risk. International efforts led by the Type 1 Diabetes Genetics Consortium and others have identified over 40 non-MHC loci and narrowed the likely candidate genes and variants substantially. The majority of non-MHC variants affect gene regulation rather than directly altering protein structure. Both analytic and molecular work are required to assess the functional significance of the variants in type 1 diabetes susceptibility genes in order to identify critical biologic pathways that could lead to novel interventions and therapeutics.
INTRODUCTION
The natural history of type 1 diabetes is based on precipitating events in an individual with genetic susceptibility. The evidence for genetic factors contributing to type 1 diabetes risk comes from twin and family studies that estimate the familial aggregation of the disease based on risk in relatives of an affected individual. In monozygotic twins, who share 100% of their genes, when a member of the pair has type 1 diabetes, the risk to the co-twin is ~50%, suggesting that both genetic and nongenetic factors contribute to risk (1). This concordance increases to 65% by age 60 years and 89% for autoantibody-positive pairs (2). High-risk genotypes of the human leukocyte antigen (HLA) loci (HLA-DR3, -DR4) in the major histocompatibility complex (MHC) (3) and the INS gene (4) are found at higher frequency in concordant pairs, suggesting a major impact of genetic factors and heterogeneity in concordance rates. The population risk is decreased to ~8% in dizygotic twin pairs, similar to the risk observed for siblings, who also share 50% of their genes, but have a lesser extent of common environmental exposures than twins. In population studies, the estimated prevalence of type 1 diabetes in non-Hispanic whites is approximately 4 per 1,000; thus, the increased risk in siblings (8%) relative to the prevalence in the general population (4/1,000) is consistent with a major genetic contribution to type 1 diabetes risk (5,6).
Type 1 diabetes is most common in non-Hispanic whites (i.e., populations of European ancestry); it also occurs in those of African, Hispanic, and Asian ancestry but at decreasing prevalence and incidence. Despite extensive epidemiologic data on prevalence of type 1 diabetes in varied ethnic groups, there is a critical absence of data on risk to siblings in diverse populations, preventing estimation of genetic impact on type 1 diabetes risk in nonwhite populations. Although nearly one-half of type 1 diabetes in non-Hispanic white populations is diagnosed after age 20 years, there is a similar absence of information on the risk to siblings of adult-onset type 1 diabetes. This lack of data can be attributed to the concept that onset of diseases (e.g., type 1 diabetes) early in life is more likely to be “genetic” than when the disease has onset later in life. As a result, estimates of the impact of genetic factors on risk of type 1 diabetes in adults (of any population ancestry) are lacking.
The genetic risk ratio in siblings (λS), defined by the ratio of the sibling risk to the population prevalence, is ~16 in type 1 diabetes of European ancestry (7), much higher than that in type 2 diabetes, for example, yet this figure provides no insight into either the number of genes contributing to type 1 diabetes risk or the sizes of their effects.
Although ~50% of the risk of type 1 diabetes can be attributed to genetic (familial) factors, an additional question can be asked about the extent of this genetic risk accounted for by specific genes and their variants. For example, in families with two children with type 1 diabetes, the expectation under the hypothesis that a variant in a gene has no effect on type 1 diabetes risk is that 25% of sibling pairs would share both copies of the variant (same genotype),
Received in final form December 27, 2015. 12–1 DIABETES IN AMERICA, 3rd Edition
50% would share one copy of the variant in the genotype, and 25% would share no copies of the variant (completely different genotypes). In fact, the risk to siblings who share two HLA haplotypes is 55%, much greater than 25%, indicating that a significant proportion of the total genetic risk can be attributed to the HLA region (7).
The recognition that ~50% of risk for type 1 diabetes was genetic and approximately one-half of the genetic risk was due to factors in the MHC provided the stimulus for two subsequent research paths. The first was to delineate the specific genes and mechanisms in the MHC (the HLA genes and also other genes and variants) that account for the majority of type 1 diabetes genetic risk. The second was to identify the non-MHC genes that account for the remaining 50% of genetic risk for type 1 diabetes. Two primary study designs were used for these efforts: family-based (typically families with two parents and two type 1 diabetes-affected children) and case-control (a series of type 1 diabetes cases and a series of unaffected controls). The family-based designs were used for linkage analyses (co-segregation of the alleles transmitted from the parents to the affected children, with evidence of increased sharing of genotype in the affected children) or for association in the presence of linkage analyses (alleles transmitted from parents to affected offspring are “case” alleles, while those not transmitted are “control” alleles). The case-control approach was used to determine the frequency of the alleles in cases and controls, with a significant difference in frequency suggesting an association with type 1 diabetes for that genetic variant. As collection of large numbers of affected sibpair (ASP) families (for robust statistical power) was difficult, much of the genetic evaluation of genes contributing to type 1 diabetes employed a case-control design. The following sections expand on the evolution of the genetic technology and findings related to the genetic basis of type 1 diabetes.
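The null expectation for sibling-pair sharing described in this introduction (25% sharing two copies, 50% sharing one, 25% sharing none) lends itself to a simple goodness-of-fit check. The following is a minimal Python sketch of that idea, not the method used in the cited studies: the function name, the constant names, and the pair counts are all hypothetical and chosen only for illustration.

```python
# Null expectation for a sibling pair at a locus with no effect on disease:
# 25% share two copies, 50% share one copy, 25% share none.
NULL_SHARING = {2: 0.25, 1: 0.50, 0: 0.25}

# 5% critical value of the chi-square distribution with 2 degrees of freedom
# (three sharing categories minus one).
CHI2_CRITICAL_2DF = 5.991


def sharing_chi_square(observed_counts):
    """Pearson chi-square statistic comparing observed sharing counts
    (keys 2, 1, 0 = copies shared) with the 25/50/25 null expectation."""
    total = sum(observed_counts.get(copies, 0) for copies in NULL_SHARING)
    if total <= 0:
        raise ValueError("at least one sibling pair is required")
    statistic = 0.0
    for copies, proportion in NULL_SHARING.items():
        expected = total * proportion
        observed = observed_counts.get(copies, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts for 100 affected sibling pairs (illustrative only,
# not data from the chapter or its references).
hypothetical_pairs = {2: 55, 1: 38, 0: 7}
statistic = sharing_chi_square(hypothetical_pairs)
excess_sharing = statistic > CHI2_CRITICAL_2DF

print(f"chi-square = {statistic:.2f}, deviates from null: {excess_sharing}")
```

A statistic well above the critical value, as with these made-up counts, is the kind of deviation from Mendelian expectation that family-based linkage designs look for; real analyses would also account for family structure and multiple testing.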
MAJOR HISTOCOMPATIBILITY COMPLEX AND TYPE 1 DIABETES RISK
Early efforts (i.e., 1970s to 2000) to localize and identify genes that contribute to the occurrence of type 1 diabetes, as well as other autoimmune diseases, focused on genes involved in the immune response. Obvious candidates, in this regard, were genes encoding the highly polymorphic HLA molecules that play critical roles in the immunologic distinction between self and non-self, as well as in the presentation of antigens to the cellular immune system. In humans, the MHC is a gene-rich region on chromosome 6p21.3 that includes genes encoding the HLA class I (HLA-A, -B, and -C) and class II (HLA-DR, -DQ, and -DP) molecules (Figure 12.1). The importance of the MHC in type 1 diabetes risk is likely through its role in the presentation of peptide antigens to T cells; genetic variation in this system could act centrally by interfering with the tolerance of lymphocytes during their maturation in the thymus, or peripherally by altering the repertoire of antigens presented. The immune response is centered on antigen presentation, in which foreign antigens are recognized by antigen-presenting cells (APCs), processed into peptides, complexed to the MHC, and presented on the cell surface, where they can potentially be recognized by T cells (Figure 12.2). As the T cell does not recognize “free” foreign peptides, the processing requires APCs and the action of a number of other genes, some also located in the HLA region, to complex the peptide with MHC. Thus, a critical aspect of protection is the ability of the MHC to provide a broad array of options to receive the foreign peptide, with these options defined by genetic variation.
FIGURE 12.1. Human Leukocyte Antigen Major Histocompatibility Complex
Genomic localization of the human major histocompatibility complex (MHC) on chromosome 6p21.3 and positions of the human leukocyte antigen (HLA) class I (HLA-A, -B, -C) and class II (HLA-DR, -DQ, -DP) loci are shown.
SOURCE: Reference 44
FIGURE 12.2. Antigen Presentation and T Cell Activation
Antigen is linked to APC via multiple mechanisms (step 1) followed by MHC class II recognition, antigen processing (step 2), and presentation to T cells for removal (step 3). APC, antigen-presenting cell; MHC, major histocompatibility complex.
SOURCE: Reference 45, copyright © 2012 Frontiers, reprinted with permission
The genes in the MHC contain many alternative forms (many alleles) and thereby provide extensive variation in humans, perhaps the most variable in the human genome. This variation is critical, as the MHC controls a major part of the human immune response through the interactions of its cell surface molecules with other molecules or peptides (both from itself and from external sources). The presence of many alleles in each of the HLA genes (class I and class II) makes it highly likely that any two individuals in a population (except for monozygotic twins) have a different combination of MHC alleles. Most cells in the body can use MHC class I molecules to present foreign peptide to (CD8+) T cells and “act” as APCs. Typically, the APCs internalize the foreign peptide and display a fragment bound to an MHC class II molecule on the cell surface (Figure 12.2). The T cell will then recognize this complex and initiate T cell activation. This becomes critical as the focus of the MHC is on immune surveillance. Thus, while there can be relatively broad ability for the MHC to recognize a broad array of peptides, there can also be specificity as to which peptides are bound. The variation permits both a broad response to peptides with a few highly variable genes and also provides protection in a population of individuals from a new foreign peptide due to extreme diversity.
The initial report of a strong association of HLA with type 1 diabetes occurred in 1973 with alleles of the class I HLA-A (or, then, HL-A) locus (8). Many reports confirmed and extended the association of type 1 diabetes with antigens/alleles of the class I (HLA-A and -B) and class II (HLA-DR, -DQ, and -DP) loci. Importantly, the structure of the MHC on 6p21.3 contains a cluster of HLA loci that are physically close (within 4 Mb, Figure 12.1) and, therefore, genetically correlated (in linkage disequilibrium, LD). The extent of the LD in the MHC spans the ~4 Mb interval and results in the transmission of HLA “haplotypes” from parent to child. As LD is the occurrence of some combinations of alleles at adjacent loci more (or less) often than expected from the frequencies of the individual alleles, groups of alleles on a chromosomal segment can be inherited as a unit, or as a haplotype. The human MHC exhibits strong LD and specific HLA haplotypes are associated with type 1 diabetes risk and provide a “fingerprint” of the transmission of disease-associated variation across populations.
Although statistical methods can be used to construct HLA haplotypes from unrelated individuals, the most precise method for observation and estimation of haplotypic association comes from family studies. As noted above, family studies have been conducted in non-Hispanic white populations, with a high prevalence and typically restricted to those probands with onset of type 1 diabetes before age 16 years (but onset up to age 35 years in siblings). A common measure of association is the odds ratio (OR), an epidemiologic statistic that measures the relationship between an exposure (in this case, genotype or haplotype) and an outcome (type 1 diabetes). As the odds ratio is used to compare the relative odds of the outcome given the exposure, it can be interpreted as “risk” (OR >1), “protection” (OR <1), or neutral (OR 1). A family study of type 1 diabetes (9) demonstrated that the strongest class I associations with type 1 diabetes occurred with the HLA-B8 (OR 3.20) and HLA-B15 (OR 3.69) alleles and the class II associations with HLA-DR2 (OR 0.21), HLA-DR3 (OR 3.54), HLA-DR4 (OR 6.81), HLA-DR5 (OR 0.30), and HLA-DR7 (OR 0.24) alleles. Presence of the HLA-B8, -B15, -DR3, and -DR4 alleles increased type 1 diabetes risk, while the presence of HLA-DR2, -DR5, and -DR7 decreased type 1 diabetes risk. However, the haplotypes formed by HLA-B7-DR2 (OR 0.10) and HLA-B15-DR4 (OR 7.55) exhibited greater extremes of association than the individual alleles, suggesting genetic complexity within the MHC association with type 1 diabetes.
It should be noted that these type 1 diabetes-associated alleles and haplotypes were obtained from studies that primarily recruited Caucasians. The association between HLA alleles and haplotypes with type 1 diabetes remains strong across populations of non-Caucasian ancestry (e.g., in Asians, a high risk is seen for DRB1*09:01, an allele found at low frequency in Caucasians and, therefore, not associated with Caucasian type 1 diabetes risk). There are also ethnic differences in HLA allele frequencies that differ from Caucasian populations, e.g., African-specific HLA-DR3 and HLA-DR7
haplotypes have opposite effects on type 1 diabetes susceptibility than their Caucasian counterparts. In addition, there are difficulties in assembling large cohorts of non-Caucasian type 1 diabetes cases or families, given differences in disease risk among races/ethnicities. The study of type 1 diabetes in non-Caucasian populations, both from an epidemiologic and a genetic perspective, is an important area of research that could provide insights on type 1 diabetes pathogenesis.
Subsequent studies have investigated the association of class I and class II alleles with type 1 diabetes, adjusting for LD within the MHC in numerous populations. The most comprehensive evaluation of these associations has occurred at the HLA and Immunogenetics Workshop (10). The class I alleles most significantly associated with type 1 diabetes are HLA-B*57:01 (protective) and HLA-B*39:06 (susceptible). The HLA-B*57:01 allele has been shown to be less frequent in cases with type 1 diabetes than controls in several populations and is associated with extreme sensitivity to abacavir (11,12), a treatment for HIV-1 that slows the spread of the virus. The HLA-B*39:06 allele increases the risk on HLA-DR4 susceptibility haplotypes in Scandinavian populations, as well as other risk haplotypes (DR3, DR1, and DR8) in larger cohorts. Other significantly associated class I alleles include HLA-A*24:02, HLA-A*02:01, HLA-B*18:01, and HLA-C*05:01 (susceptibility); HLA-A*11:01, HLA-A*32:01, HLA-A*66:01, HLA-B*07:02, HLA-B*44:03, HLA-B*35:02, HLA-C*16:01, and HLA-C*04:01 (protective) (10,13).
HLA allele and haplotype frequencies differ among populations, perhaps contributing to the differences in type 1 diabetes prevalence across geographic areas. A North-South gradient in type 1 diabetes prevalence has been observed, with Scandinavians having the highest type 1 diabetes risk. Sardinia, however, has a very high type 1 diabetes risk, due in part to the founder effect that established particularly high frequencies of the type 1 diabetes-susceptible HLA-DR3 and HLA-DR4 alleles and haplotypes and the low frequency of the type 1 diabetes-protective HLA-DR2 allele and haplotype (14).
Further complexity in the MHC and risk of type 1 diabetes has been uncovered. Variation in the genes encoding the HLA-DR and HLA-DQ molecules is associated with risk of type 1 diabetes with odds ratios in excess of 10 for susceptible genotypes (e.g., DR3/DR4 or DRB1*03:01-DQB1*02:01/DRB1*04-DQB1*03:02) or less than 0.1 for protective genotypes (e.g., DR2/DRX, where “X” is neither DR3 nor DR4 [or DR2]; the dominantly protective haplotype is DRB1*15:01-DQB1*06:02). The class II molecules encoded by these genes combine to form heterodimeric (αβ) protein receptors that are typically expressed on the surface of APC. HLA-DQ is determined by a polymorphic α-chain (encoded by HLA-DQA1 locus) and a polymorphic β-chain (encoded by HLA-DQB1 locus). In contrast, HLA-DR is determined by a monomorphic α-chain (encoded by the HLA-DRA locus) paired with a highly polymorphic β chain. Most DR haplotypes in the population contain two loci: DRB1 and a second DRB locus (either DRB3, DRB4, or DRB5). A few DR haplotypes (e.g., DRB1*01, DRB1*08, and DRB1*10) do not have a second DRB locus. A heterozygous individual could encode up to four different DRB molecules (15). Thus, type 1 diabetes risk is determined by specific genotype combinations; for example, the genotype HLA-DRB1*03(HLA-DQB1*02)/HLA-DRB1*04(HLA-DQB1*03:02) confers the highest risk and has the highest frequency in type 1 diabetes cases with youngest onset.
The allelic variations in the critical HLA class II genes (HLA-DRB1, HLA-DQA1, and HLA-DQB1) are thought to alter specific amino acid residues that affect binding of foreign peptides leading to initiation of the autoimmune disease process. Only recently have independent amino acid positions and large-scale genetic studies been available to examine the contribution of HLA class II alleles and their interactions in type 1 diabetes (16). In a large series of European cases, controls and families, over 7,000 single nucleotide polymorphisms (SNPs) and amino acid residues at 399 positions for eight HLA genes were examined. As expected, the single most associated risk variant was HLA-DQB1*03:02 that encodes an alanine at HLA-DQβ1 position 57, while the aspartic acid at this position is most protective. The second independent association is at HLA-DRβ1 position 13, where a histidine residue (tagging HLA-DR4) confers risk (arginine is protective), as does a serine residue that tags HLA-DR3 (tyrosine is protective). The HLA-DRβ1 residue at position 71 is a third independent association, with lysine conferring strong risk (tagging HLA-DR3 and HLA-DR4), while an alanine residue at this site is protective (Figure 12.3) (16).
FIGURE 12.3. Effect Sizes for Amino Acid Residues
[Figure: paired case/control frequency bars (y-axis: frequency) for residues D, A, V, S at HLA-DQβ1 position 57; S, H, R, Y, F, G at HLA-DRβ1 position 13; and R, K, A, E at HLA-DRβ1 position 71.]
Case (filled bars) and control (unfilled bars) frequencies, as well as unadjusted univariate odds ratio estimates, are shown for each residue at HLA-DQβ1 position 57, HLA-DRβ1 position 13, and HLA-DRβ1 position 71. The number above the paired (case/control) bars is the unadjusted univariate odds ratio. The letter underneath each set of paired (case/control) bars is the single letter code for the amino acid residue: A, alanine; D, aspartic acid; E, glutamic acid; F, phenylalanine; G, glycine; H, histidine; K, lysine; R, arginine; S, serine; V, valine; Y, tyrosine.
SOURCE: Reference 16, copyright © 2015 Nature Publishing Group, reprinted with permission
The HLA-DQβ1 position 57, HLA-DRβ1 position 13, and HLA-DRβ1 position 71 are each located in the peptide-binding groove of the HLA molecule (Figure 12.4) (16). HLA-DRβ1 positions 13 and 71 line the P4 pocket of HLA-DR. The HLA-DQβ1 position 57 alone accounts for ~15% of the total risk of type 1 diabetes, and HLA-DRβ1 position 13 and HLA-DRβ1 position 71 account for an additional 12% of risk (16). Thus, together these three amino acid positions capture ~27% of type 1 diabetes risk, or ~80% of the MHC-associated risk. The total type 1 diabetes risk explained by independent (additive) effects in the classical HLA genes is ~34% (Figure 12.5). Structurally, the known association of HLA-DQβ1 position 57 would alter the HLA-DQ P9 pocket; however, the novel associations with the HLA-DRβ1 position 13 and HLA-DRβ1 position 71 sites would alter the HLA-DR P4 pocket. This structural site may be important in binding specific autoantigens that are associated with type 1 diabetes risk.
FIGURE 12.4. Structural Model of HLA-DQ and HLA-DR Molecules
HLA-DQβ1 position 57, HLA-DRβ1 position 13, and HLA-DRβ1 position 71 are each located in the respective molecule’s peptide-binding groove. HLA-DRβ1 positions 13 and 71 line the P4 pocket of the HLA-DR molecule.
SOURCE: Reference 16, copyright © 2015 Nature Publishing Group, reprinted with permission
FIGURE 12.5. Phenotypic Variance in the HLA-DRB1–HLA-DQA1–HLA-DQB1 Locus
[Figure: variance explained (%) by HLA-DRB1–HLA-DQA1–HLA-DQB1 and its three amino acid positions, by HLA-A, HLA-B, and HLA-DPB1, and by non-HLA loci including INS and PTPN22.]
Assuming the liability threshold model and a global type 1 diabetes prevalence of 0.4%, all haplotypes in HLA-DRB1–HLA-DQA1–HLA-DQB1 together explain 29.6% of total phenotypic variance. HLA-DQβ1 position 57 alone explains 15.2% of the variance; the addition of HLA-DRβ1 position 13 and HLA-DRβ1 position 71 increases the explained proportion to 26.9%. Therefore, these three amino acid positions together capture over 90% of the signal within HLA-DRB1–HLA-DQA1–HLA-DQB1. In contrast, variation in HLA-A, HLA-B, and HLA-DPB1 together explain approximately 4% of total variance. Genome-wide independently associated SNPs outside the HLA together explain about 9% of variance; rs678 (in INS) and rs2476601 (in PTPN22) explain 3.3% and 0.78%, respectively. HLA, human leukocyte antigen; SNP, single nucleotide polymorphism.
SOURCE: Reference 16, copyright © 2015 Nature Publishing Group, reprinted with permission
Although the influence of HLA class II genes on type 1 diabetes risk remains unquestioned, significant residual evidence of association with type 1 diabetes can be detected in the MHC region after controlling for the effects of HLA class II loci (16). Analysis of HLA haplotypes suggests that non-additive effects are common within the MHC. HLA-DQβ1 position 57 and HLA-DRβ1 position 13 are the strongest contributors to both additive and interactive effects on type 1 diabetes risk in the MHC (17). These two strongest interacting positions are in separate HLA molecules (HLA-DQ for position 57 and HLA-DR for position 13). In a detailed interaction analysis of the MHC, the effect of interaction within a site (dominance) was shown to have a small (~1%), but significant impact on the contribution to risk of type 1 diabetes (17). Including interactions at the individual haplotypes, rather than alleles at a given site, resulted in an additional 3% of risk of type 1 diabetes. Together, the additive and interaction effects of the classical HLA sites and amino acid residues account for over 90% of the MHC-type 1 diabetes association.
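Three quantities used in this section can be illustrated with short calculations: the odds ratio as a measure of association, linkage disequilibrium between alleles at two loci, and the share of the HLA-DRB1–HLA-DQA1–HLA-DQB1 variance captured by the three amino acid positions. The Python sketch below is illustrative only. The function names, the 2×2 counts, and the allele and haplotype frequencies are hypothetical; the r² formula is the standard textbook definition rather than anything specified in the chapter; only the variance percentages (15.2, 26.9, and 29.6) are taken from the Figure 12.5 caption.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    return (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)


def interpret_odds_ratio(value):
    """Label an odds ratio as in the text: risk (>1), protection (<1), neutral."""
    if value > 1:
        return "risk"
    if value < 1:
        return "protection"
    return "neutral"


def linkage_disequilibrium(hap_freq_ab, freq_a, freq_b):
    """Return (D, r2) for alleles A and B at two loci.

    D is the haplotype frequency minus the product of the allele
    frequencies; r2 rescales D squared by the allele-frequency variances.
    """
    d = hap_freq_ab - freq_a * freq_b
    r2 = d * d / (freq_a * (1 - freq_a) * freq_b * (1 - freq_b))
    return d, r2


# Hypothetical 2x2 counts (not data from the chapter).
risk_or = odds_ratio(60, 40, 30, 70)        # carriers enriched in cases
protective_or = odds_ratio(10, 90, 30, 70)  # carriers depleted in cases

# Hypothetical frequencies: the A-B haplotype is far more common (0.10)
# than the 0.12 * 0.15 expected if the alleles were independent.
d, r2 = linkage_disequilibrium(0.10, 0.12, 0.15)

# Percent of phenotypic variance, from the Figure 12.5 caption.
position_57 = 15.2
three_positions = 26.9
whole_locus = 29.6
added_by_13_and_71 = three_positions - position_57
share_of_locus = three_positions / whole_locus

print(f"OR {risk_or:.2f} -> {interpret_odds_ratio(risk_or)}")
print(f"OR {protective_or:.2f} -> {interpret_odds_ratio(protective_or)}")
print(f"D = {d:.3f}, r2 = {r2:.2f}")
print(f"positions 13 and 71 add {added_by_13_and_71:.1f} points; "
      f"three positions capture {share_of_locus:.0%} of the locus signal")
```

The last calculation reproduces the caption's statement that the three positions capture over 90% of the signal within HLA-DRB1–HLA-DQA1–HLA-DQB1, and the roughly 12 percentage points attributed in the text to HLA-DRβ1 positions 13 and 71.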
NON-MAJOR HISTOCOMPATIBILITY COMPLEX RISK LOCI
The fundamental process by which genetic variation is associated with type 1 diabetes risk is through assessing DNA sequence variation that differs in type 1 diabetes cases and controls or through co-segregation with type 1 diabetes in families. One such DNA variant occurs when a single nucleotide in the genome differs in a population with greater than 5% frequency. This variant is then termed a single nucleotide polymorphism (SNP). There are millions of SNPs throughout the human genome, roughly one SNP every 2,000 bases.
LINKAGE MAPPING
The primary focus of genome-wide linkage scans in type 1 diabetes was to identify regions of the genome that exhibited a significant deviation of “sharing” (identity-by-descent, IBD) by sibling pairs from that expected for Mendelian segregation of chromosomes (e.g., 25% sharing two alleles IBD, 50% sharing one allele IBD, 25% sharing no alleles IBD). In type 1 diabetes, this process was typically performed on multiple affected siblings in a family, with a large number of families required to achieve robust statistical power given the expected size of risk in non-MHC genes (18,19). This strategy effectively addressed an underlying hypothesis of detecting a gene with variant alleles that have a large effect on risk of type 1 diabetes.
Unlike the MHC, for which a sample of 100 ASP families could detect significant evidence of linkage, the estimated risks for other genes suggested a sample size of as many as 4,000 ASP families might be required. The Type 1 Diabetes Genetics Consortium (T1DGC) was created to assemble a collection of ASP families for the purpose of conducting genome-wide linkage scans for type 1 diabetes (20). Although the resolution of the linkage approach was limited to ~1 Mb regions, the first major analysis of 2,658 ASPs confirmed the contribution of the HLA region, insulin (INS) gene, and CTLA4 (18). With additional ASP family collection, analysis of 4,422 ASP families continued to identify the contribution of the MHC and the INS locus, but no compelling evidence for other susceptibility genes with similar large effects was observed (19). These results suggested that additional loci contributing as much as 50% of the genetic risk have smaller effects, may be common (allele frequency greater than 5%) in the population or private (unique) to individual families, and require a different experimental approach for detection.
GENOME-WIDE ASSOCIATION SCANS
With the improved genomic coverage through SNP discovery from the International HapMap Project (21) and decreasing costs of genotyping, the use of genome-wide association scans (GWAS) with robust statistical power is feasible. The primary source of participants for GWAS is unrelated type 1 diabetes cases and controls, as the statistical method is association and not linkage. Through association, the SNP that exhibits a difference in frequency between cases and controls (type 1 diabetes-associated variant) is thought to be in LD with the “true” type 1 diabetes causal variant. However, family-based association analyses provide important complementary results because of their resistance to stratification bias, caused (for example) when genetic differences in cases and controls occur due to sampling from populations of different ancestry, and are unrelated to disease.
The first GWAS publications in type 1 diabetes focused on nonsynonymous SNPs (those that change the amino acid sequence of protein), with the assumption that variants resulting in an amino acid change would increase the likelihood of detecting a functional change associated with type 1 diabetes risk (22). From this design, a novel gene, IFIH1, was identified as having multiple rare protein coding variants associated with type 1 diabetes. The function of IFIH1 was not in the obvious immune pathway of cytokines or beta cell proteins; IFIH1 is a cytoplasmic sensor of viral RNA that is part of the innate immune system and the response to viral infection. Notably, IFIH1 is involved in the response to picornaviruses, a family that includes viruses previously implicated in type 1 diabetes risk (23). Rare, inactivating alleles of IFIH1 are underrepresented among those with type 1 diabetes, suggesting that loss-of-function is protective from type 1 diabetes. In this manner, GWAS results can lead to discovery of genes that identify novel biologic pathways and avenues to novel therapeutic (or interventional) drug targets to prevent or reduce the burden of type 1 diabetes.
The Wellcome Trust Case-Control Consortium (WTCCC) conducted the first extensive GWAS using the Affymetrix GeneChip 500K Mapping Array Set in participants from the United Kingdom (24). Approximately 2,000 subjects for each of seven diseases and a shared set of approximately 3,000 controls were genotyped. Case-control analyses identified seven significant, independent associations with type 1 diabetes. Six genes/regions had strong preexisting statistical support for a role in type 1 diabetes susceptibility: MHC, INS, CTLA4, PTPN22, IL2RA/CD25, and IFIH1. Regions on chromosomes 12q13, 12q24, and 16p13 exhibited strong, yet not statistically significant, evidence of association that warrant additional investigation. The WTCCC study demonstrated the power of the GWAS approach to identify novel loci associated with type 1 diabetes in an unbiased manner and suggested that larger samples with greater density of SNP genotyping could uncover additional novel genes and pathways related to type 1 diabetes risk. The 12q13 and 12q24 loci map to regions containing several functional candidates, including ERBB3 (12q13) and SH2B3 (12q24). The 16p13 region contained only two genes of unknown function, one of which (KIAA0350) was identified in a second GWAS (25) and as a novel locus of susceptibility for other autoimmune diseases, consistent with pervasive sharing of genetic risk across immune-mediated disorders (26).
The T1DGC expanded the findings from the WTCCC by initiating a new GWAS of 3,983 type 1 diabetes cases and 3,999 controls, followed by a combined meta-analysis with previously published studies that added a further 3,715 cases and 5,069 controls, with information from more than 841,622 SNPs. From the meta-analysis of 7,514 cases and 9,045 controls with confirmation in 4,267 cases, 4,670 controls, and 2,319 ASP families, 28 novel regions provided initial significant evidence for association to type 1 diabetes risk (26). Eighteen novel regions were strongly replicated from the GWAS meta-analysis.
In total, the number of loci contributing significantly to risk of type 1 diabetes had now reached 42, although each locus varied in size of the interval, the number of candidate genes in the locus, the size of the effect on type 1 diabetes risk for the most associated SNP, and the strength of the evidence supporting the most likely candidate (Figure 12.6). This figure demonstrates several points. The first is that the genes (loci) of largest impact on type 1 diabetes risk were identified in the 30-year period from early 1970s (HLA) to 2000 (INS), reflecting the larger effects identified only with smaller sample sizes. These two regions account for over 50% of the genetic variation in type 1 diabetes risk and were identified as the result of small case-control and family studies. In 2001–2006, with the advent of large-scale genotyping arrays and assembly of larger case-control series, several additional loci were identified (including PTPN22, CTLA4, and IFIH1) that slightly increased the amount of genetic risk identified. In 2007–2008, the first well-powered GWAS by the WTCCC greatly increased the number of loci, but these were of modest (e.g., IL2RA) to small contribution to risk. In 2009, the T1DGC GWAS meta-analysis (26) nearly doubled the number of loci (with 8,000 cases and 9,000 controls plus replication), each having a small effect. Of interest, the meta-analysis confirmation in the T1DGC ASP families was consistent in direction of the risk allele effect, but the size of the effect was consistently less in the families than in the case-control comparison, perhaps indicating the greater role of the MHC in type 1 diabetes risk when multiple family members express disease; statistical evidence for replication in families was not reached for 11 of the 18 loci.
FIGURE 12.6. Progress Mapping Type 1 Diabetes Susceptibility Genes Based on Era of Genomic Search
[Figure: odds ratio (y-axis) for each susceptibility locus (x-axis), grouped by era of discovery: 1970–2000, 2001–2006, 2007–2008, and 2009.]
SOURCE: Reference 26
The T1DGC further examined the 18 novel loci by expansion of the family sample to account for potential population stratification and replication genotyping (27). The analyses of these 2,322 additional families, combined with the original 2,319 families, provided improved protection from population stratification bias and increased power to provide further replication support for the associations of the 18 novel susceptibility loci. With this larger sample of families, only one (C14orf181/14q24.1) of the 18 novel loci failed to reach statistical significance. Further, all of the novel type 1 diabetes risk loci had consistent direction of effects (e.g., risk or protective for type 1 diabetes risk) with the original GWAS meta-analysis (26,27) with no evidence of heterogeneity in the disease associations across family collections, despite there being significant SNP genotype frequency differences. After unequivocal replication of type 1
FIGURE 12.7. Gene Content of Type 1 Diabetes Susceptibility Loci Identified by Genome-Wide Association Study Meta-Analysis
[Figure: y-axis, number of genes.]