Page 1 of 47 Diabetes

Targeted Deep Sequencing in Multiple-Affected Sibships of European Ancestry Identifies

Rare Deleterious Variants in PTPN22 that Confer Risk for Type 1 Diabetes

Short title: Rare PTPN22 Variants Contributing to T1D

Yan Ge,1 Suna OnengutGumuscu,2,3 Aaron R. Quinlan,4,5 Aaron J. Mackey,2,3 Jocyndra A.

Wright,1 Jane H. Buckner,6 Tania Habib,6 Stephen S. Rich,2,3 Patrick Concannon1,7

1Department of Pathology, Immunology, and Laboratory Medicine, University of Florida,

Gainesville, FL

2Center for Public Health Genomics, University of Virginia, Charlottesville, VA

3Department of Public Health Sciences, University of Virginia, Charlottesville, VA

4Department of Human Genetics, University of Utah, Salt Lake City, UT

5Department of Biomedical Informatics, University of Utah, Salt Lake City, UT

6Translational Research Program, Benaroya Research Institute at Virginia Mason, Seattle, WA

7Genetics Institute, University of Florida, Gainesville, FL

Corresponding author:

Name: Patrick Concannon

Address: University of Florida Genetics Institute, 2033 Mowry Road, CGRC Room 115, Box

103610, Gainesville, FL 32610

Tel: (352) 2945735

Fax: (352) 2738284

1

Diabetes Publish Ahead of Print, published online December 2, 2015 Diabetes Page 2 of 47

Email: [email protected]

Word count: 3,932

2 tables, 4 figures, 3 supplementary tables, and 2 supplementary figures

2 Page 3 of 47 Diabetes

ABSTRACT

Despite the finding of over 40 risk loci for type 1 diabetes (T1D), the causative variants and

remain largely unknown. Here, we sought to identify rare deleterious variants of moderate

tolarge effects contributing to T1D. We deeply sequenced 301 coding genes located in

49 previously reported T1D risk loci in 70 T1D cases of European ancestry. These cases were

selected from putatively highrisk families that had three or more siblings diagnosed with T1D at

early ages. A cluster of rare deleterious variants in PTPN22 was identified, including two novel

frameshift mutations (ss538819444 and rs371865329) and two missense variants (rs74163663

and rs56048322). Genotyping in 3,609 T1D families showed that rs56048322 was significantly

associated with T1D, and that this association was independent of the T1Dassociated, common

variant rs2476601. The risk allele at rs56048322 affects splicing of PTPN22 resulting in the

production of two alternative PTPN22 transcripts and a novel isoform of LYP (the protein

encoded by PTPN22). This isoform competes with the wildtype LYP for binding to CSK and

results in hyporesponsiveness of CD4+ T cells to antigen stimulation in T1D subjects. These

findings demonstrate that in addition to common variants, rare deleterious variants in PTPN22

exist and can affect T1D risk.

3 Diabetes Page 4 of 47

INTRODUCTION

Type 1 diabetes (T1D) results from the autoimmune destruction of insulinproducing β cells in the pancreatic islets, culminating in complete insulin deficiency and lifelong reliance on administration of exogenous insulin. Although there exist rare monogenic forms of T1D (1,2), the common form is regarded as genetically complex, arising from the actions, and possible interactions, of multiple genetic and environmental risk factors. The human leukocyte antigen

(HLA) region on 6p21 accounts for approximately half of the genetic risk for T1D

(3,4). Genomewide association studies (GWAS) and metaanalyses have identified over 40 additional, nonHLA T1D risk loci (5–13), most of which have modest individual effects on disease risk (14).

Despite the identification of numerous T1Dassociated genomic regions from GWAS, the specific causative variants for T1D located in these, and other regions of the genome, as well as the genes they impact, remain largely unknown. Linkage disequilibrium within disease associated regions limits the power to pinpoint, by association mapping, the causative variants among a number of tightly correlated, and common, alleles at different polymorphic sites.

Additional difficulties are posed by associations detected at sites that are neither coding nor obviously regulatory and, therefore, without predicted biological functions.

One solution to these challenges may lie in the identification of infrequent or rare protein

altering variants, which might be expected to cluster in the or genes relevant to pathogenesis in a diseaseassociated region and, having more readily discernible effects on gene

function, could provide insights into the role of the gene in pathogenesis. In a prior study of a

subset of T1Dassociated loci, the exons and splice sites of 10 candidate genes in T1D cases and

healthy controls were deeply sequenced (15). Four infrequent variants (minor allele frequency,

4 Page 5 of 47 Diabetes

MAF, <3%) in IFIH1 were found to be independently associated with lower risk for T1D,

providing important clues as to the role of IFIH1 in T1D pathogenesis.

In the current study, we searched for rare deleterious variants of moderatetohigh

penetrance by deep sequencing of the coding regions of 301 genes in 49 nonHLA T1D risk loci

in T1D cases from families with three or more siblings diagnosed with T1D at early ages. Such

families are rare and putatively at high risk, given that the risk to siblings of a T1D case is

approximately 6% (16). We identified a cluster of rare deleterious variants in PTPN22 and found

that one of them, rs56048322, affected the splicing of PTPN22 transcripts and the function of

PTPN22 in T cells.

5 Diabetes Page 6 of 47

RESEARCH DESIGN AND METHODS

Human Subjects

Genomic DNA samples from T1Daffected sib pairs (ASP) and trio families were obtained from two sources: the Human Biological Data Interchange (HBDI) (375 ASP families, n = 1,840) (17,18) and the Type 1 Diabetes Genetics Consortium (T1DGC) (3,234 ASP/trio families, n = 12,424) (19). DNA and peripheral blood mononuclear cells (PBMC) were obtained from the Benaroya Research Institute Immune Mediated Disease Registry for Translational

Research. Research protocols were approved by the Benaroya Research Institute Institutional

Review Board.

Selection Criteria

Potential highrisk T1D families were identified based on the following criteria: (1) the parents of the T1D cases were of European ancestry and had three or more children diagnosed with T1D at or after one year of age (to exclude neonatal diabetes) and prior to 21 years; and (2) neither of the parents had T1D (to reduce the risk of selecting subjects with dominantly inherited maturity onset diabetes of the young, MODY).

Targeted Deep Sequencing and Bioinformatic Analyses

For sequencing, ten genomic DNA samples were combined into each of 7 pools, and libraries were constructed using the PairedEnd Sample Preparation kit (Illumina). Target regions were enriched using the SureSelect PairedEnd Target Enrichment System (Agilent

Technologies). Seven targetenriched libraries were sequenced on an Illumina Genome Analyzer

II, and 63bp pairedend reads were generated.

6 Page 7 of 47 Diabetes

The sequencing reads were aligned to the UCSC hg18 human reference genome with the

BurrowsWheeler Aligner (version 0.5.9) (20). PCR duplicate reads were removed using

SAMtools, and one sorted Binary Alignment/Map (BAM) file was generated and indexed for

each sequencing library (21). Singlenucleotide polymorphisms (SNPs) and indels were

identified using FreeBayes (version 0.5.1) and the Genome Analysis Toolkit (GATK, version

2.49) (22), respectively. Sequencing depths were calculated using the Picard package (version

1.87). Variants were annotated using the Variant Effect Predictor (VEP, version 71) (23). The

impacts of missense variants on protein functions were predicted using SIFT (version 5.0.2) (24)

and PolyPhen (version 2.2.2) (25). Variants with the following consequences were considered

deleterious: frameshift, inframe insertion/deletion, stop codon lost or gained, predicted

damaging missense by both SIFT and PolyPhen, or occurring in a splice donor or acceptor

region. A variant was regarded as rare if its MAF was ≤0.00036 in European populations; this

corresponded to a probability of 0.05 of detecting a variant allele in 70 randomly sampled

individuals.

Genotyping

The MGBEclipse system (ELITechGroup, EPOCH BIOSCIENCES) was used to

genotype 3,609 T1D ASP/trio families consisting of 14,264 participants collected by the HBDI

and the T1DGC, according to the manufacturer’s protocol.

Cell Isolation, Stimulation, and Culture

Primary B cells, T cells, and monocytes were purified from PBMC by positive selection

with CD19, CD3, and CD14 magnetic beads, respectively (Miltenyi). For quantitative PCR,

CD4+ T cells were purified from PBMC by negative selection using the human CD4+ T cell

7 Diabetes Page 8 of 47

isolation kit (Miltenyi), and were stimulated with 2.5 ug/ml platebound antiCD3 antibody

(clone OKT3) and 0.25 ug/ml soluble antiCD28 antibody for 0, 6, 18, and 24 hours. Jurkat cells

and Blymphoblastoid cell lines (BLCLs) were cultured in RPMI 1640 medium supplemented

with 10% FBS (Thermo Scientific HyClone), 2.05 mM Lglutamine, 100 units/ml penicillin, 100

o ug/ml streptomycin, and 1 mM sodium pyruvate (Life Technologies) at 37 C with 5% CO2.

Quantitative RT-PCR

Total RNA was extracted using the RNeasy Mini kit (QIAGEN), and contaminating genomic DNA was removed with Ambion DNAfreeTM DNase Treatment and Removal

Reagents (Life Technologies). Firststrand cDNA was synthesized using random hexamer primers and Superscript II Reverse Transcriptase (Life Technologies). PCRs containing SYBR

Green I and ROX reference dye were performed on an ABI 7900HT Fast RealTime PCR

System (Life Technologies). All samples were tested in triplicate. Relative levels were calculated using the comparative CT method (26), where CT = mean CT (PTPN22) –

mean CT (GAPDH). The primers used for the assay are provided in Supplementary Table 1. The

heat map in Fig. 1C was generated using the 2CT values and the R packages, gplots and

RColorBrewer.

Co-immunoprecipitation and Immunoblotting

Cells were lysed in EBC lysis buffer (50 mM TrisHCl, pH 7.5, 120 mM NaCl, 0.5% NP

40, and 1 mM EDTA) containing protease and phosphatase inhibitors (Complete Mini and

PhosSTOP, Roche). One milligram of lysate in 1 ml EBC lysis buffer was precleared with

GammaBind Plus Sepharose beads (GE Healthcare) for 2 h at 4oC and then incubated overnight at 4oC with 0.6 g rabbit antiCSK antibody (Cell Signaling TECHNOLOGY, #4980) or 0.6 g

8 Page 9 of 47 Diabetes

normal rabbit IgG (SouthernBiotech). Immune complexes were captured by incubating the lysate

with 20 l GammaBind Plus Sepharose beads for 2 h at 4oC. The beads were washed four times

with 1 ml EBC lysis buffer containing 1 mM PMSF and 1 mM sodium orthovanadate. The

immunoprecipitates were eluted by heating the beads for 5 min at 95oC in 25 l LDS sample

buffer (Life Technologies) containing 50 mM DTT.

Wholecell lysates and immunoprecipitates were resolved on 7% NuPAGE Novex Tris

Acetate gels (Life Technologies). Immunoblotting was performed using antibodies to LYP

(R&D Systems, AF3428, 0.3 g/ml), CSK (Abnova, H00001445M01, 1 g/ml), and γTubulin

(SigmaAldrich, T6557, 1:10,000 dilution).

Calcium Flux Measurement

Previously frozen PBMC were incubated with the cellpermeant dye, Indo1 AM

(Molecular Probes), for 45 minutes at 37°C, and then surfacestained with antiCD8PE/Cy5

(BioLegend), CD27APC (BD Biosciences), CD45ROPE (BD Biosciences), CD45RAFITC

(BioLegend), CD4PerCP/Cy5.5 (BioLegend), and CD19PE/Cy7 (BioLegend). The indo1

fluorescence ratio was acquired as a function of time on an LSRII flow cytometer, and kinetics

curves were generated using the FlowJo software. Collection of a 30second or 1minute baseline

was followed by stimulation with 10 g/ml soluble antiCD3 (clone UCHT1, BioLegend), 20

g/ml antiκ + antiλ F(ab’)2 (Southern Biotech), or 1 g/ml ionomycin. All the calcium flux

profiles were generated using a standard protocol.

Statistical Analyses

The FamilyBased Association Test (FBAT) program (version 2.0.4) was used for single

marker association tests and haplotype analyses, as well as for checking Mendelian inconsistency

9 Diabetes Page 10 of 47

in genotyping (27). Minor allele frequencies were estimated using PLINK (v.1.07) (28).

Conditional association analyses were conducted using the UNPHASED program (version 3.1)

(29).

10 Page 11 of 47 Diabetes

RESULTS

Identification of Rare Deleterious Variants by Targeted Sequencing of T1D Cases from

Multiple-Affected Families of European Ancestry

We applied our selection criteria for putatively highrisk families (see Research Design

and Methods) and chose 70 T1D cases of European ancestry from the T1Daffected families

collected by the T1DGC. In these 70 cases, the coding regions of all 301 known proteincoding

genes located in 49 T1Dassociated chromosomal regions, excluding the HLA region, were

captured (1.12 Mb per subject) (Supplementary Table 2) and sequenced in pools of 10 subjects,

each.

For each pool, on average, 97.64% of the target regions were covered by at least two

reads (ranging from 97.56% to 97.74%), and 93.4% of the target bases were covered at ≥30x

(ranging from 92.9% to 94.0%). All pools had mean target coverage of >660x (ranging from

661x to 1018x). Fiftyone genes were identified that harbored two or more rare (MAF ≤0.00036)

predicted deleterious variants. In the T1Dassociated region on chromosome 1p13.2, 10 genes

were sequenced; only two rare (MAF ≤0.00036) predicted deleterious variants were identified,

and both occurred in PTPN22—a 4bp deletion in exon 11 (ss538819444) and a 1bp deletion in

exon 13 (rs371865329). Two infrequent (MAF <0.01) coding missense variants with predicted

deleterious effects, rs74163663 (p.Arg748Gly) and rs56048322 (p.Lys750Asn), were also

identified, and, similarly, clustered in PTPN22. A simulation in which 10,000 sets of 70 T1D

cases were randomly selected from the initial study population showed that the probability of

identifying all four of the deleterious variants seen in PTPN22 was 1.3 x 104, suggesting that the

criteria used to define highrisk T1D families had effectively enriched for infrequent variants

11 Diabetes Page 12 of 47

with predicted deleterious effects in PTPN22. Of note, the selection criteria did not enrich for the

most common genetic risk variants for T1D, which are highrisk HLA DRB1DRA1DQB1

haplotypes (Supplementary Table 3).

PTPN22 is an established risk locus for T1D and several other autoimmune disorders

(30–34). Risk at PTPN22 is associated with the minor allele at a common nonsynonymous

coding region SNP, rs2476601, which results in the missense substitution R620W.

The Rare Allele of rs56048322 in PTPN22 Has an Independent Effect on T1D Risk

We genotyped ss538819444, rs371865329, rs74163663 and rs56048322 in 14,264

individuals from 3,609 T1Daffected ASP/trio families. Singlemarker familybased association

tests (27), were performed to examine association between variants in PTPN22 and T1D.

Positive and negative Z scores of FBAT indicate risk and protection, respectively. The single

marker FBAT showed that the minor alleles of rs56048322 (P = 0.0014), rs74163663 (P =

0.028), and rs2476601 (P = 8.62 x 1036) were each associated with T1D (Table 1). Comparable results were obtained assessing sharing of alleles at these markers among ASPs by the transmissiondisequilibrium test (TDT). No association with T1D was detected for ss538819444 or rs371865329, but their rarity in the population (each occurred in fewer than three families) limited power to assess their effects on T1D risk.

Six haplotypes for ss538819444, rs371865329, rs74163663, rs56048322, and the common T1D risk variant, rs2476601, were identified in the T1D ASP/trio families (Table 2).

(Differences in the numbers of informative families shown in Tables 1 and 2 are due to missing genotypes.) The H2 haplotype carrying the minor 1858T risk allele at rs2476601 was significantly associated with T1D (P = 1.81 x 1034), as expected. The H3 haplotype, which

12 Page 13 of 47 Diabetes

carried the 2250C risk allele at rs56048322, was also associated with T1D (P = 0.0014) in the

absence of the common risk allele at rs2476601 (Table 2). The minor allele at rs56048322 still

exhibited a significant association with T1D (P = 0.0003) after conditioning on rs2476601. The

minor allele of rs74163663 cooccurred with the 1858T risk allele at rs2476601 (Table 2), and

showed a marginally significant association with T1D (P = 0.06) after conditioning on

rs2476601.

The Minor Allele of rs56048322 Is Associated with the Production of Alternative PTPN22

Transcripts

The SNP rs56048322 alters the last nucleotide of exon 18 of PTPN22. In a prior report,

PBMC from one individual heterozygous at rs56048322 produced an alternative PTPN22

transcript in which exon 17 was spliced to exon 19 creating a premature stop codon at the exon

17to19 junction (Fig. 1A) (35). In BLCLs derived from subjects heterozygous at rs56048322,

both this previously reported alternative transcript and a second more common alternative

PTPN22 transcript, resulting from runon transcription into intron 18, were detected (Fig. 1C).

These alternative transcripts and the fulllength PTPN22 transcript were quantified in primary T

cells, B cells, and monocytes from five subjects, including one healthy control, two rheumatoid

arthritis (RA) and two relapsing polychondritis (RP) cases (Fig. 1B). The use of samples from

subjects with autoimmune disorders other than T1D were necessitated by both the rarity of

carriers of the minor allele at rs56048322 and the availability of comparably treated, viably

frozen PBMCs from these subjects. As shown in Fig. 1C, the presence of the risk allele was

associated with a substantial enrichment for the two alternative PTPN22 transcripts in all three

primary cell types and in BLCLs derived from T1D families (Fig. 1C). The average relative

expression level of the intron18runon transcript in the heterozygous BLCLs was 72 times as

13 Diabetes Page 14 of 47

high as that in the BLCLs homozygous for the major allele (Fig. 1C). The exon18skipping

transcript was detected only at low levels in individuals heterozygous at rs56048322 (Fig. 1C). In

B1.1 and B1.2, which were heterozygous for c.878_881delAGAT, transcripts carrying the 4bp

deletion were detected by Sanger sequencing of PCR products generated from cDNA, but the

relative expression level of fulllength PTPN22 was not evidently lower in these cells than in the

wildtype BLCLs (Fig. 1C).

In human primary CD4+ T cells stimulated with antiCD3 and antiCD28 antibodies, the amounts of the fulllength PTPN22 transcript peaked at 6 hours post stimulation and then decreased to below basal levels by 24 hours, regardless of the genotype at rs56048322 (Fig. 2A,

2B). The intron18runon PTPN22 transcript was expressed at low levels in CD4+ T cells from subjects homozygous for the major allele of rs56048322, and the expression levels decreased upon stimulation (Fig. 2A, 2C). In contrast, in CD4+ T cells from carriers of the risk allele of

rs56048322, the amounts of the intron18runon transcript, which were markedly higher than

those in the noncarriers, increased upon stimulation and peaked at 6 hours (Fig. 2A, 2C) as the

fulllength transcript did. The exon18skipping PTPN22 transcript was detected only in carriers

of the risk allele of rs56048322, and the expression levels decreased upon stimulation (Fig. 2A,

2D).

The Rare Allele of rs56048322 Produces a Novel LYP Isoform that Can Bind to CSK

PTPN22 encodes the intracellular lymphoid protein tyrosine phosphatase (LYP) which

regulates signaling through the T and B cell receptors. LYP forms a complex with the Cterminal

Src kinase (CSK), a negative regulator of T cell receptor (TCR)mediated signaling. This

complex of LYP and CSK is reported to exert synergistic inhibition on TCR signaling (36). LYP

14 Page 15 of 47 Diabetes

has four prolinerich motifs (P1P4) at the Cterminus, of which the P1 motif interacts with the

SH3 domain of CSK (37,38). The tryptophan residue introduced at position 620 by the risk allele

of rs2476601 is located in the P1 motif and decreases the binding of LYP to CSK (30). It has

been proposed that this effect on the interaction between LYP and CSK is responsible for the risk

for autoimmunity associated with the risk allele at rs2476601. In wholecell lysates derived from

BLCLs heterozygous at rs56048322, two LYP isoforms were detected, one corresponding to the

predicted size for fulllength LYP (91.7 kDa) and the second to the predicted size for the intron

18runon product (87.2 kDA). Both isoforms could be coimmunoprecipitated with CSK at

comparable levels (Fig. 3B), indicating comparable ability to bind to CSK.

The Rare Allele of rs56048322 Impairs TCR-Mediated Calcium Flux in CD4+ T Cells

Although the association of the minor allele at rs2476601 with autoimmunity is well

established, there are conflicting reports as to whether this allele results in loss or gain of

function of LYP (30,39–43). The availability of a second independent risk allele affecting the

coding region of PTPN22 provided an opportunity to further explore the mechanism whereby

PTPN22 mediates risk for autoimmunity. To determine the effects of the independent T1D risk

allele of rs56048322, intracellular calcium flux induced by TCR crosslinking in CD4+ T cells

from the peripheral blood of a healthy subject and agematched cases with T1D or RA was

compared (Fig. 4). All subjects were homozygous for the nonrisk 1858C allele at rs2476601.

Compared to the cases homozygous for the major allele of rs56048322, CD4+ T cells from cases

heterozygous at rs56048322 exhibited decreased TCRmediated intracellular calcium flux (Fig.

4), consistent with a gain of function. However, the risk allele of rs56048322 did not impair

calcium flux in TCRstimulated CD4+ T cells from nonautoimmune control subjects

(Supplementary Fig. 1), nor in human primary B cells stimulated via B cell receptor (BCR)

15 Diabetes Page 16 of 47

(Supplementary Fig. 2).

16 Page 17 of 47 Diabetes

DISCUSSION

In this study, we searched for rare risk variants of moderatetolarge effect sizes in

chromosomal regions implicated as harboring risk loci for T1D in prior GWAS. We

implemented a familybased approach, based on the hypothesis that T1D families having

multiple affected siblings with early ages at onset would be enriched for risk variants with large

effect sizes which might be expected to be rare in the population. We focused on rare variants

that had substantial predicted impacts on the encoded , including proteinaltering indels,

stop gain/loss variants, and splice variants, with the expectation that their distribution among

genes in a region might identify the relevant causative gene and their predicted deleterious

effects provide insight into the mechanism whereby the implicated gene might act to confer risk

for T1D. Multiple novel deleterious variants in different T1D risk regions were identified by this

approach. Functional studies were performed for one gene, PTPN22, where the enrichment for

rare indels and predicted deleterious missense variants supported its role as the causative gene in

the region and provided an opportunity to explore the mechanism of its effect on T1D risk.

Conflicting results have been reported as to whether the risk allele of rs2476601 in

PTPN22 which encodes an R to W amino acid substitution at position 620 results in loss (30,39)

or gain (40–43) of function. Two of the deleterious variants identified by sequencing of PTPN22

were predicted to result in premature termination, consistent with a possible loss of function, but

had MAFs of <0.0002, limiting the ability to carry out functional studies. In B lymphoblastoid

cells from carriers of these variants, no alternative PTPN22 isoforms were observed and the

wildtype isoform occurred at normal levels. A rare, predicted deleterious missense variant,

rs74163663, was detected but only on haplotypes also containing the common risk allele at

rs2476601. The minor allele of the remaining rare deleterious variants identified in PTPN22,

17 Diabetes Page 18 of 47

rs56048322, was associated with increased T1D risk independent of the common T1Dassociated

risk allele of rs2476601 (Table 2). The risk allele of rs56048322 interfered with the correct

splicing of PTPN22 resulting in production of two alternative PTPN22 transcripts, one of which was degraded upon T cell activation, while the other generated significant amounts of a lower molecularweight Cterminally truncated LYP isoform. Unlike the product of the risk allele at rs2476601 (30), this LYP isoform bound CSK through its P1 domain but lacked the Cterminal

P3 and P4 proline rich domains which likely mediate additional, as yet uncharacterized, protein protein interactions. Despite the ability of its protein product to bind CSK, the rs56048322 risk allele, similar to the risk allele at rs2476601, resulted in blunted TCRmediated calcium flux in

CD4+ T cells, a phenotype most pronounced in the memory CD4+ T cell compartment (data not shown) (41). Unlike carriers of the rs2476601 risk allele, carriers of the risk allele at rs56048322 displayed more variability in the degree of T cell hyporeactivity, and this phenotype was only observed in subjects with existing autoimmunity. No effect of rs56048322 was observed on

BCRmediated calcium flux in CD19+ B cells (Supplementary Fig. 2), and further investigation

will be required to determine whether rs56048322 affects signaling in specific subtypes of B

cells as rs2476601 does (41). It should be noted that carriers of the rs56048322 risk allele are

rare. Only 6 carriers were available for the functional studies carried out here; a comprehensive

description of immune phenotypes associated with this variant will require further ascertainment

and study. Nevertheless, the findings presented here suggest that the rare risk allele at

rs56048322 and the more common risk allele at rs2476601 exert similar effects on T cell

activation and function, but likely do so through different mechanisms. We propose that both act by dominant interference, producing a protein product either lacking or defective in selected proline rich domains but differing in which domains are affected and, presumably, what

18 Page 19 of 47 Diabetes

interactions are disrupted. The evidence from these two, independently diseaseassociated

genetic variants, acting by distinct mechanisms but resulting in a common phenotype, suggests

that risk alleles at PTPN22 contribute to T1D susceptibility via blunted TCR signaling that may

be due to either direct effects of the risk variants or compensatory alterations in this pathway in

response to these variants.

In addition to providing insight into the mechanism of action of PTPN22 in

autoimmunity, the enrichment we observed for rare deleterious variants in this single gene in the

1p13.2 region raises the possibility that resequencing in highrisk families, as defined here, could

be used to identify causative genes in other regions delineated by GWAS, or more broadly to

identify novel risk loci. Although we did identify additional deleterious variants in other T1D

associated regions in our study, the small number of the families studied (n = 70) limited our

power to detect any enrichment for deleterious variants in single genes at additional sites.

Certainly, the use of highrisk families has been instrumental in identifying highly penetrant risk

variants in various cancers. Our success in the current study with PTPN22 encourages the

application of this approach on a larger scale in autoimmunity. Of note, the T1DGC family

collection, which was ascertained based on the presence of at least two affected siblings, is

already enriched for highrisk HLA DRDQ haplotypes; our selection criteria for highrisk

families, which require one additional affected sibling, did not further increase the frequencies of

highrisk HLA DRDQ haplotypes (Supplementary Table 3) but were effective at identifying

lowfrequency deleterious variants of T1Dassociated loci with lower overall risk than the HLA

locus. While the current study utilized a targeted approach focusing on chromosomal regions

previously implicated by GWAS, an exome sequencing approach would be more efficient and

offer the possibility of identifying novel risk loci not previously identified by GWAS. In the case

19 Diabetes Page 20 of 47

of multiple sclerosis (MS), exome sequencing was recently performed in 43 affected individuals

selected from families with four or more MSaffected members in more than one generation.

One novel lossoffunction variant was identified in CYP27B1 in one of the cases, and the variant was significantly associated with MS, indicating a causative role for CYP27B1 in MS (44).

Results such as these, and the findings from the current study, support the examination of the human exome in families with high disease burdens as an approach for the identification of causative genes and variants for common, autoimmune disorders.

20 Page 21 of 47 Diabetes

Acknowledgments. This research utilizes resources and data provided by the Type 1 Diabetes

Genetics Consortium, the NHLBI GO Exome Sequencing Project, and the ImmunoChip project.

We thank investigators and staff at the Benaroya Research Institute at Virginia Mason for subject

recruitment and sample processing.

Funding. This work was supported by National Institutes for Health Grants DK046635 and

DK085678 to P.C. and DK097672 to J.H.B.

Duality of Interest. No potential conflicts of interest relevant to this article were reported.

Author Contributions. Y.G., S.O., and P.C. designed the study. Y.G., S.O., A.J.M., J.A.W., and

T.H. performed research. J.H.B. and S.S.R. contributed new reagents/analytic tools. Y.G., S.O.,

and A.R.Q. analyzed data. Y.G. and P.C. wrote the paper. P.C. is the guarantor of this work and,

as such, had full access to all the data in the study and takes responsibility for the integrity of the

data and the accuracy of the data analysis.

21 Diabetes Page 22 of 47

References

1. Nagamine K, Peterson P, Scott HS, Kudoh J, Minoshima S, Heino M, et al. Positional cloning of the APECED gene. Nat Genet. 1997;17(4):393–8.

2. Bennett CL, Christie J, Ramsdell F, Brunkow ME, Ferguson PJ, Whitesell L, et al. The immune dysregulation, polyendocrinopathy, enteropathy, Xlinked syndrome (IPEX) is caused by mutations of FOXP3. Nat Genet. 2001;27(1):20.

3. Risch N. Assessing the role of HLAlinked and unlinked determinants of disease. Am J Hum Genet. 1987;40(1):1.

4. Noble JA, Valdes AM, Cook M, Klitz W, Thomson G, Erlich HA. The role of HLA class II genes in insulindependent diabetes mellitus: molecular analysis of 180 Caucasian, multiplex families. Am J Hum Genet. 1996;59(5):1134.

5. Todd JA, Walker NM, Cooper JD, Smyth DJ, Downes K, Plagnol V, et al. Robust associations of four new chromosome regions from genomewide analyses of type 1 diabetes. Nat Genet. 2007;39(7):857–64.

6. Burton PR, Clayton DG, Cardon LR, Craddock N, Deloukas P, Duncanson A, et al. Association scan of 14,500 nonsynonymous SNPs in four diseases identifies autoimmunity variants. Nat Genet. 2007;39(11):1329–37.

7. Hakonarson H, Grant SF, Bradfield JP, Marchand L, Kim CE, Glessner JT, et al. A genomewide association study identifies KIAA0350 as a type 1 diabetes gene. Nature. 2007;448(7153):591–4.

8. Burton PR, Clayton DG, Cardon LR, Craddock N, Deloukas P, Duncanson A, et al. Genomewide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78.

9. Hakonarson H, Qu HQ, Bradfield JP, Marchand L, Kim CE, Glessner JT, et al. A novel susceptibility locus for type 1 diabetes on Chr12q13 identified by a genomewide association study. Diabetes. 2008;57(4):1143–6.

10. Cooper JD, Smyth DJ, Smiles AM, Plagnol V, Walker NM, Allen JE, et al. Metaanalysis of genomewide association study data identifies additional type 1 diabetes risk loci. Nat Genet. 2008;40(12):1399–401.

11. Grant SF, Qu HQ, Bradfield JP, Marchand L, Kim CE, Glessner JT, et al. Followup analysis of genomewide association data identifies novel loci for type 1 diabetes. Diabetes. 2009;58(1):290–5.

22 Page 23 of 47 Diabetes

12. Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, Erlich HA, et al. Genome wide association study and metaanalysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet. 2009;41(6):703–7.

13. Bradfield JP, Qu HQ, Wang K, Zhang H, Sleiman PM, Kim CE, et al. A genomewide metaanalysis of six type 1 diabetes cohorts identifies multiple associated loci. PLoS Genet. 2011;7(9):e1002293.

14. Concannon P, Rich SS, Nepom GT. Genetics of type 1A diabetes. N Engl J Med. 2009;360(16):1646–54.

15. Nejentsev S, Walker N, Riches D, Egholm M, Todd JA. Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science. 2009;324(5925):387– 9.

16. Spielman RS, Baker L, Zmijewski C. Gene dosage and susceptibility to insulin dependent diabetes. Ann Hum Genet. 1980;44(2):135–50.

17. Lernmark A, Ducat L, Eisenbarth G, Ott J, Permutt MA, Rubenstein P, et al. Family cell lines available for research. Am J Hum Genet. 1990;47(6):1028.

18. Cox NJ, Wapelhorst B, Morrison VA, Johnson L, Pinchuk L, Spielman RS, et al. Seven regions of the genome show evidence of linkage to type 1 diabetes in a consensus analysis of 767 multiplex families. Am J Hum Genet. 2001;69(4):820–30.

19. Rich SS, Akolkar B, Concannon P, Erlich H, Hilner JE, Julier C, et al. Overview of the Type I Diabetes Genetics Consortium. Genes Immun. 2009;10(S1):S1–4.

20. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25(14):1754–60.

21. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.

22. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing nextgeneration DNA sequencing data. Genome Res. 2010;20(9):1297–303.

23. McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics. 2010;26(16):2069–70.

24. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31(13):3812–4.

23 Diabetes Page 24 of 47

25. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9.

26. Livak KJ, Schmittgen TD. Analysis of Relative Gene Expression Data Using RealTime Quantitative PCR and the 2 CT Method. methods. 2001;25(4):402–8.

27. Laird NM, Horvath S, Xu X. Implementing a unified approach to familybased tests of association. Genet Epidemiol. 2000;19(S1):S36–42.

28. Purcell S, Neale B, ToddBrown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for wholegenome association and populationbased linkage analyses. Am J Hum Genet. 2007;81(3):559–75.

29. Dudbridge F. Likelihoodbased association analysis for nuclear families and unrelated subjects with missing genotype data. Hum Hered. 2008;66(2):87–98.

30. Bottini N, Musumeci L, Alonso A, Rahmouni S, Nika K, Rostamkhani M, et al. A functional variant of lymphoid tyrosine phosphatase is associated with type I diabetes. Nat Genet. 2004;36(4):337–8.

31. Begovich AB, Carlton VE, Honigberg LA, Schrodi SJ, Chokkalingam AP, Alexander HC, et al. A Missense SingleNucleotide Polymorphism in a Gene Encoding a Protein Tyrosine Phosphatase (PTPN22) Is Associated with Rheumatoid Arthritis. Am J Hum Genet. 2004;75(2):330–7.

32. Kyogoku C, Ortmann WA, Lee A, Selby S, Carlton VE, Chang M, et al. Genetic association of the R620W polymorphism of protein tyrosine phosphatase PTPN22 with human SLE. Am J Hum Genet. 2004;75(3):504–7.

33. Velaga MR, Wilson V, Jennings CE, Owen CJ, Herington S, Donaldson PT, et al. The codon 620 tryptophan allele of the lymphoid tyrosine phosphatase (LYP) gene is a major determinant of Graves’ disease. J Clin Endocrinol Metab. 2004;89(11):5862–5.

34. Vandiedonck C, Capdevielle C, Giraud M, Krumeich S, Jais JP, Eymard B, et al. Association of the PTPN22* R620W polymorphism with autoimmune myasthenia gravis. Ann Neurol. 2006;59(2):404–7.

35. OnengutGumuscu S, Buckner JH, Concannon P. A haplotypebased analysis of the PTPN22 locus in type 1 diabetes. Diabetes. 2006;55(10):2883–9.

36. Cloutier JF, Veillette A. Cooperative inhibition of Tcell antigen receptor signaling by a complex between a kinase and a phosphatase. J Exp Med. 1999;189(1):111–21.

24 Page 25 of 47 Diabetes

37. Cohen S, Dadi H, Shaoul E, Sharfe N, Roifman CM. Cloning and characterization of a lymphoidspecific, inducible human protein tyrosine phosphatase, Lyp. Blood. 1999;93(6):2013– 24.

38. Cloutier JF, Veillette A. Association of inhibitory tyrosine protein kinase p50csk with protein tyrosine phosphatase PEP in T cells and other hemopoietic cells. EMBO J. 1996;15(18):4909.

39. Zhang J, Zahir N, Jiang Q, Miliotis H, Heyraud S, Meng X, et al. The autoimmune diseaseassociated PTPN22 variant promotes calpainmediated Lyp/Pep degradation associated with lymphocyte and dendritic cell hyperresponsiveness. Nat Genet. 2011;43(9):902–7.

40. Vang T, Congia M, Macis MD, Musumeci L, Orrú V, Zavattari P, et al. Autoimmune associated lymphoid tyrosine phosphatase is a gainoffunction variant. Nat Genet. 2005;37(12):1317–9.

41. Rieck M, Arechiga A, OnengutGumuscu S, Greenbaum C, Concannon P, Buckner JH. Genetic variation in PTPN22 corresponds to altered function of T and B lymphocytes. J Immunol. 2007;179(7):4704–10.

42. Arechiga AF, Habib T, He Y, Zhang X, Zhang ZY, Funk A, et al. Cutting edge: the PTPN22 allelic variant associated with autoimmunity impairs B cell signaling. J Immunol. 2009;182(6):3343–7.

43. Vang T, Liu WH, Delacroix L, Wu S, Vasile S, Dahl R, et al. LYP inhibits Tcell activation when dissociated from CSK. Nat Chem Biol. 2012;8(5):437–46.

44. Ramagopalan SV, Dyment DA, Cader MZ, Morrison KM, Disanto G, Morahan JM, et al. Rare variants in the CYP27B1 gene are associated with multiple sclerosis. Ann Neurol. 2011;70(6):881–6.

25 Diabetes Page 26 of 47

TABLE 1. Single-marker family-based association analysis of variants in the PTPN22 gene

Exon rsID cDNA Amino acid 22 12 11 MAF Famil Z P value

chang change y (n)

e

11 ss5388 c.878_ p.Gln293Ar 14,210 5 0 0.00018 2 0.447 0.65

19444 881del gfs*45

AGAT

13 rs3718 c.1040 p.Ile348Serf 14,184 6 0 0.00008 1 1.000 0.32

65329 delA s*14 8

14 rs2476 c.1858 p.Arg620Tr 10,600 3,30 341 0.12 1,023 12.489 8.62E36

601 C>T p 3

18 rs7416 c.2242 p.Arg748Gl 14,154 22 0 0.00053 7 2.191 0.028

3663 A>G y

18 rs5604 c.2250 p.Lys750As 13,943 284 0 0.0087 93 3.190 0.0014

8322 G>C n

22, homozygous for the major allele; 12, heterozygous; 11, homozygous for the minor allele;

MAF, minor allele frequency; n, number of informative families. cDNA and amino acid changes

are based on NM_015967.5 and NP_057051.3, respectively. Z scores and p values are calculated

for the minor alleles.

26 Page 27 of 47 Diabetes

TABLE 2. Family-based haplotype association analysis of PTPN22

Haploty ss5388 rs3718 rs2476 rs7416 rs5604 Famil Freq. Z P value pe 19444 65329 601 3663 8322 y (n)

H1 2 2 2 2 2 0.872 1,072 12.98 1.68E38

H2 2 2 1 2 2 0.119 1,012 12.24 1.81E34

H3 2 2 2 2 1 0.008 92 3.19 0.0014

H4 2 2 1 1 2 0.001 7 2.19 0.028

<0.00 H5 1 2 2 2 2 2 0.45 0.65 1

<0.00 H6 2 1 2 2 2 1 1.00 0.32 1

1, minor allele; 2, major allele; freq., frequency; n, number of informative families.

27 Diabetes Page 28 of 47

FIG. 1. Relative expression levels of three PTPN22 transcripts. (A) Schematic diagrams of the locations of c.2250G>C (rs56048322) and primers (indicated by arrows) used for quantitative realtime PCR. Transcript 1 depicts fulllength PTPN22, transcript 2 an alternative PTPN22 transcript that extends into intron 18, and transcript 3 an alternative PTPN22 transcript that skips exon 18 by splicing exon 17 to exon 19. (B) Characteristics of human subjects. RA, rheumatoid arthritis; RP, relapsing polychondritis. (C) A heat map of the relative expression levels of the three PTPN22 transcripts. CT values were calculated by subtracting the mean CT of GAPDH

CT from the mean CT of PTPN22. The 2 values are represented by colors: dark red indicates high

relative expression levels, and light yellow low relative expression levels; gray indicates values below the detection limit of the assay. The pedigree information of the Blymphoblastoid cell

lines is shown (white circle = unaffected female; black circle = T1Daffected female; black

square = T1Daffected male). c.2250G/C heterozygous samples are marked in red. All the

subjects and cell lines are homozygous for the major, nonrisk allele of rs2476601 in PTPN22.

28 Page 29 of 47 Diabetes

FIG. 2. Relative expression levels of three PTPN22 transcripts in human primary CD4+ T cells

upon T cell receptormediated stimulation. (A) Information on the human subjects in panel (B

D). Carriers of the minor, risk allele of rs56048322 are shown in red. All the subjects are

homozygous for the major, nonrisk allele of rs2476601 in PTPN22. RA, rheumatoid arthritis.

(BD) Primary CD4+ T cells from five human subjects were stimulated with antiCD3 and anti

CD28 antibodies for 0, 6, 18, and 24 hours. The relative expression levels of three PTPN22

+ transcripts (depicted in Fig. 1A) in the stimulated CD4 T cells are shown. CT values were

CT calculated by subtracting the mean CT of GAPDH from the mean CT of PTPN22, and the 2

values are shown.

29 Diabetes Page 30 of 47

FIG. 3. Characterization of the effects of c.2250G>C on LYP. (A) Immunoblotting of wholecell

lysates from the following cell lines: Jurkat, GM12155, four Blymphoblastoid cell lines derived

from two T1Daffected families, and HEK293T. The pedigree information and genotypes for

c.2250G>C (rs56048322) are shown (white circle = unaffected female; black circle = T1D

affected female; black square = T1Daffected male). The c.2250G/C heterozygous genotype is

highlighted in bold. LYP was detected using an antiLYP antibody. The blot was stripped and re probed with an antiγTubulin antibody. (B) Wholecell lysates from B3.1 and B3.2 were

immunoprecipitated with a rabbit antiCSK antibody or normal rabbit IgG. The

immunoprecipitated samples and the wholecell lysates were subjected to immunoblotting with

an antiLYP antibody. The blot was stripped and reprobed with an antiCSK antibody.

30 Page 31 of 47 Diabetes

FIG. 4. The risk allele of rs56048322 affects T cell receptormediated intracellular calcium flux

in CD4+ T cells from subjects with T1D or RA. Previously frozen peripheral blood mononuclear

cells from a healthy subject and from subjects with T1D or RA were loaded with cellpermeant

indo1 AM dye and stained with antibodies against T cell surface markers. Shown are kinetics

profiles depicting the mean indo1 ratio (violet/blue) as a function of time in gated total CD4+ T

cells before and after stimulation with 10 g/ml antiCD3 antibody. Percent maximal calcium

flux is expressed as [(peak antiCD3 flux – mean baseline)/(peak ionomycin flux – mean

baseline)] x100. All the subjects are homozygous for the major, nonrisk allele of rs2476601 in

PTPN22. RA, rheumatoid arthritis; T1D, type 1 diabetes.

31 Diabetes Page 32 of 47

86x60mm (300 x 300 DPI)

Page 33 of 47 Diabetes

96x72mm (300 x 300 DPI)

Diabetes Page 34 of 47

132x243mm (300 x 300 DPI)

Page 35 of 47 Diabetes

91x63mm (300 x 300 DPI)

Diabetes Page 36 of 47

Targeted Deep Sequencing in Multiple-Affected Sibships of European Ancestry

Identifies Rare Deleterious Variants in PTPN22 that Confer Risk for Type 1

Diabetes

Justification on the necessity of the online supplement

The supplementary tables and figures do not warrant inclusion in the main text, but they provide additional information related to the methods and data of the main text.

Supplementary Table 1 lists the primers used in the quantitative real-time PCR (Fig. 1 and 2). Supplementary Table 2 lists the protein-coding genes that were captured and deeply sequenced in this study. Supplementary Table 3 shows the distributions of HLA haplotypes in the 70 T1D cases sequenced in this study and in the initial T1DGC population from which the 70 cases were selected. Supplementary Figures 1 and 2 show intracellular calcium flux profiles in CD4+ T cells from subjects without autoimmune diseases and in CD19+ B cells, respectively. Page 37 of 47 Diabetes

Supplementary Table 1. Primers used for quantitative RT-PCR

Transcrip Forward primer (5’ to 3’) Reverse primer (5’ to 3’)

t

PTPN22 TCCTGACACCATGGAAAATTCA GGTGGATTCCTTGGTCCTTTG

transcript

1

PTPN22 GTGTAAAACTCCGAAGTCCTAAA CGGCATGTTTCCCAAAACTCT

transcript TCAG T

2

PTPN22 TCTTGCCGATGAAGATTAGTTTG CCAAAATTCAGAAATGAGCT

transcript AA GGA

3

GAPDH CAGCCGAGCCACATCGC CATGGGTGGAATCATATTGG

AACA

PTPN22 transcript 1 is the full-length PTPN22 transcript, PTPN22 transcript 2 the intron-

18-run-on transcript, and PTPN22 transcript 3 the exon-18-skipping transcript (see Fig.

1A).

Diabetes Page 38 of 47

Supplementary Table 2. Known protein-coding genes in T1D-associated regions targeted in deep sequencing

T1D-associated Protein-coding genes regions

1p13.2 OLFML3, DCLRE1B, AP4B1, BCL2L15, PTPN22, HIPK1,

RSBN1, PHTF1, MAGI3, SYT6

1q31.2 RGS1

1q32.1 IL20, IL19, IL10, MAPKAPK2, DYRK3

2q11.2 CHST10, LONRF2, AFF3

2q12.1 SLC9A4, IL18RAP, IL18R1, IL1RL1, IL1RL2

2q24.2 GCA, IFIH1, FAP, KCNH7, GCG

2q32.2 STAT4

2q33.2 ICOS, CTLA4

3p21.31 TDGF1, LRRC2, RTP3, LTF, CCRL2, CCR5, CCR3, CCR2,

CCR1, XCR1, CXCR6, FYCO1

4p15.2 -

4q27 IL21, ADAD1, IL2, KIAA1109

5p13.2 UGT3A2, UGT3A1, CAPSL, IL7R, SPEF2

6q15 BACH2

6q22.32 CENPW

6q23.3 TNFAIP3

6q25.3 TAGAP, RSPH3, C6orf99 Page 39 of 47 Diabetes

7p15.2 HOXA7, HOXA6, HOXA5, HOXA4, HOXA3, HOXA2, HOXA1,

SKAP2, C7orf71, HOXA9

7p12.1 COBL

9p24.2 GLIS3

10p15.1 PFKFB3, RBM17, IL2RA

10p15.1 PRKCQ

10q22.3 ZMIZ1

10q23.31 RNLS

11p15.5 ASCL2, TH, INS, IGF2-AS, IGF2

12p13.31 CD69, CLECL1, CLEC2D, KLRB1

12q13.2 APOF, STAT2, IL23A, PAN2, CNPY2, CS, COQ10A,

ANKRD52, SLC39A5, NABP2, RNF41, SMARCC2, MYL6,

MYL6B, ZC3H10, ESYT1, PA2G4P4, IKZF4, RPS26P53,

ERBB3, SUOX, RAB5B, CDK2, PMEL

12q13.3 CTDSP2, AVIL, TSFM, METTL21B, METTL1, MARCH9,

CYP27B1, CDK4, AGAP2, TSPAN31, OS9, B4GALNT1,

ARHGEF25, DTX3, PIP4K2C, KIF5A, DCTN2, MBD6, MARS,

DDIT3, ARHGAP9, GLI1, INHBE, INHBC, R3HDM2, STAC3,

NDUFA4L2, SHMT2, NXPH4, LRP1, NAB2, STAT6,

TMEM194A, MYO1A, TAC3, ZBTB39, GPR182, RDH16,

SDR9C7, HSD17B6, PRIM1, NACA, PTGES3, ATP5B, BAZ2A,

RBMS2P1, XRCC6BP1

12q24.12 PTPN11, RPL6P27, HECTD4, TRAFD1, NAA25, ERP29, Diabetes Page 40 of 47

TMEM116, MAPKAPK5, ALDH2, ACAD10, BRAP, SH2B3,

ATXN2, FAM109A, CUX2, MYL2, CCDC63, RPH3A

14q24.1 ZFP36L1, C14orf181*

14q32.2 -

14q32.2 -

15q14 RASGRP1, C15orf53

15q25.1 CTSH, RASGRF1, MORF4L1, ADAMTS7

16p13.13 RMI2, PRM1, PRM2, PRM3, TNP2, SOCS1, CLEC16A, DEXI,

CIITA, LITAF

16p11.2 NFATC2IP, SPNS1, CD19, RABEP2, ATP2A1, SH2B1,

ATXN2L, TUFM, EIF3C, SULT1A2, SULT1A1, CCDC101,

IL27, APOBR, CLN3, EIF3CL, SBK1, LAT

16q23.1 CHST6, TMEM170A, CFDP1, BCAR1, CTRB1, CTRB2, ZFP1

17q12 THRA, MED24, CSF3, GSDMA, PSMD3, ORMDL3, GSDMB,

ZPBP2, IKZF3, GRB7, MIEN1, ERBB2, PGAP3, PNMT, TCAP,

PPP1R1B, STARD3, NEUROD2, CDK12, MED1, FBXL20,

STAC2, NR1D1

17q21.2 KRT24, KRT222, SMARCE1

18p11.21 PTPN2

18q22.2 CD226, DOK6

19p13.2 S1PR5, KEAP1, PDE4A, CDC37, TYK2, ICAM3, RAVER1,

FDX1L, ZGLP1, ICAM5, ICAM4, ICAM1

19q13.32 FKRP, STRN4, PRKD2, DACT3, SLC1A5 Page 41 of 47 Diabetes

20p13 SIRPG, SIRPB1, SIRPD

21q22.3 TMPRSS3, UBASH3A

22q12.2 LIF, HORMAD2, MTMR3, ASCC2, UQCR10, ZMAT5, CABP7,

NF2, THOC5, NIPSNAP1, OSM, NEFH, RFPL1

22q12.3 IL2RB

22q13.1 RAC2, SSTR3, C1QTNF6

Xp22.2 TMSB4X, TLR8

Xq28 BRCC3, CMC4, MTCP1, FUNDC2, H2AFB1, F8A1, F8,

MPP1, DKC1, GAB3, CTAG2, CTAG1B, VBP1

*C14orf181 was withdrawn from the online repository of HGNC-approved genes.

(http://www.genenames.org/ accessed on April 19, 2013).

Diabetes Page 42 of 47

Supplementary Table 3. Distribution of HLA DR-DQ haplotypes associated with type 1 diabetes

Exon T1DGC Chi- Seq EUR square HLA * DRB1 DQA1 DQB1 Freq.† Freq.* P value Risk Haplotype 1 (S1) 04:05 03:01 03:02 0.021 0.025 1.00 Risk Haplotype 2 (S2) 04:01 03:01 03:02 0.264 0.281 0.74 Risk Haplotype 3 (S3) 03:01 05:01 02:01 0.307 0.341 0.48 Risk Haplotype 4 (S4) 04:02 03:01 03:02 0.043 0.035 0.80 Risk Haplotype 5 (S5) 04:04 03:01 03:02 0.064 0.050 0.61 Protective Haplotype 1 (P1) 07:01 02:01 03:03 0 0.001 1.00 Protective Haplotype 2 (P2) 14:01 01:01 05:03 0 0 / Protective Haplotype 3 (P3) 15:01 01:02 06:02 0 0.004 0.98 Protective Haplotype 4 (P4) 11:04 05:01 03:01 0 0.002 1.00 Protective Haplotype 5 (P5) 13:03 05:01 03:01 0 0.001 1.00 Risk Haplotype 1-5 0.700 0.732 0.42 Protective Haplotype 1-5 0 0.008 0.58 DR3/DR4 Diplotype 0.400 0.394 1.00

*Information on the risk DR-DQ haplotypes (S1-S5), the protective DR-DQ haplotypes

(P1-P5), and the haplotype/diplotype frequencies in 606 probands of T1D families of

European ancestry ascertained by the Type 1 Diabetes Genetics Consortium was obtained from Reference (1).

†Frequencies in the 70 T1D cases that were deeply sequenced in this study. Page 43 of 47 Diabetes

Supplementary Figure 1. The risk allele of rs56048322 does not impair T cell receptor-

mediated intracellular calcium flux in CD4+ T cells from subjects without autoimmune

diseases. Previously frozen peripheral blood mononuclear cells were loaded with cell-

permeant indo-1 AM dye and stained with antibodies against T cell surface markers.

Shown are kinetics profiles depicting the mean indo-1 ratio (violet/blue) as a function of

time in gated total CD4+ T cells before and after stimulation with 10 µg/ml of anti-CD3

antibody. Percent maximal calcium flux is expressed as [(peak anti-CD3 flux – mean

baseline)/(peak ionomycin flux – mean baseline)] x100. All the subjects are homozygous

for the major, non-risk allele of rs2476601 in PTPN22.

Diabetes Page 44 of 47

Supplementary Figure 2. The risk allele of rs56048322 does not alter B cell receptor- mediated intracellular calcium flux in human primary CD19+ B cells. Previously frozen peripheral blood mononuclear cells were loaded with cell-permeant indo-1 AM dye and surface-stained with anti-CD19. Shown are kinetics profiles depicting the mean indo-1 ratio (violet/blue) as a function of time in gated total CD19+ B cells before and after stimulation with 20 µg/ml anti-κ + anti-λ F(ab’)2. Percent maximal calcium flux is expressed as [(peak anti-κ/λ flux – mean baseline)/(peak ionomycin flux – mean baseline)] x100. All the subjects are homozygous for the major, non-risk allele of rs2476601 in

PTPN22. T1D, type 1 diabetes. Page 45 of 47 Diabetes

References:

1. Erlich H, Valdes AM, Noble J, Carlson JA, Varney M, Concannon P, et al. HLA DR- DQ haplotypes and genotypes and type 1 diabetes risk: analysis of the type 1 diabetes genetics consortium families. Diabetes. 2008 Apr;57(4):1084–92.

Diabetes Page 46 of 47

92x61mm (300 x 300 DPI)

Page 47 of 47 Diabetes

91x62mm (300 x 300 DPI)