Role for Msh5 in the regulation of Ig class switch recombination

Hideharu Sekinea, Ricardo C. Ferreirab,c, Qiang Pan-Hammarströmd, Robert R. Grahame, Beth Ziembab, Sandra S. de Vriesf, Jiabin Liub, Keli Hippenb, Thearith Koeuthb, Ward Ortmannb,c, Akiko Iwahoria, Margaret K. Elliotta, Steven Offerb, Cara Skonb, Likun Dud, Jill Novitzkeb, Annette T. Leeg, Nianxi Zhaoh, Joshua D. Tompkinsh, David Altshulere, Peter K. Gregerseng, Charlotte Cunningham-Rundlesi, Reuben S. Harrisb, Chengtao Herh, David L. Nelsonj, Lennart Hammarströmd, Gary S. Gilkesona, and Timothy W. Behrensb,c,k

aMedical University of South Carolina, Charleston, SC 29425; bUniversity of Minnesota Medical School, Minneapolis, MN 55455; dKarolinska University Hospital, SE-141 86 Huddinge, Sweden; eBroad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142; fThe Netherlands Cancer Institute, 1066 CX, Amsterdam, The Netherlands; gFeinstein Institute for Medical Research, Manhasset, NY 11030; hWashington State University, Pullman, WA 99164; iMount Sinai School of Medicine, New York, NY 10029; and jNational Cancer Institute, Bethesda, MD 20892

Communicated by Richard H. Scheller, Genentech, Inc., South San Francisco, CA, February 19, 2007 (received for review December 8, 2006)

IMMUNOLOGY

Ig class switch recombination (CSR) and somatic hypermutation serve to diversify antibody responses and are orchestrated by the activity of activation-induced cytidine deaminase and many proteins involved in DNA repair and genome surveillance. Msh5, a gene encoded in the central MHC class III region, and its obligate heterodimerization partner Msh4 have a critical role in regulating meiotic homologous recombination and have not been implicated in CSR. Here, we show that MRL/lpr mice carrying a congenic H-2b/b MHC interval exhibit several abnormalities regarding CSR, including a profound deficiency of IgG3 in most mice and long microhomologies at Ig switch (S) joints. We found that Msh5 is expressed at low levels on the H-2b haplotype and, importantly, a similar long S joint microhomology phenotype was observed in both Msh5- and Msh4-null mice. We also present evidence that genetic variation in MSH5 is associated with IgA deficiency and common variable immune deficiency (CVID) in humans. One of the human MSH5 alleles identified contains two nonsynonymous polymorphisms, and the variant protein encoded by this allele shows impaired binding to MSH4. Similar to the mice, Ig S joints from CVID and IgA deficiency patients carrying disease-associated MSH5 alleles show increased donor/acceptor microhomology, involving pentameric DNA repeat sequences and lower mutation rates than controls. Our findings suggest that Msh4/5 heterodimers contribute to CSR and support a model whereby Msh4/5 promotes the resolution of DNA breaks with low or no terminal microhomology by a classical nonhomologous end-joining mechanism while possibly suppressing an alternative microhomology-mediated pathway.

immunoglobulin subclass deficiency | mismatch repair | Msh4

Author contributions: H.S. and R.C.F. contributed equally to this work; H.S., R.C.F., L.H., G.S.G., and T.W.B. designed research; H.S., R.C.F., B.Z., J.L., K.H., T.K., A.I., M.K.E., S.O., C.S., L.D., N.Z., and J.D.T. performed research; Q.P.-H., S.S.d.V., J.N., A.T.L., D.A., P.K.G., C.C.-R., C.H., D.L.N., and L.H. contributed new reagents/analytic tools; H.S., R.C.F., Q.P.-H., R.R.G., B.Z., J.L., K.H., W.O., R.S.H., G.S.G., and T.W.B. analyzed data; and H.S., R.C.F., G.S.G., and T.W.B. wrote the paper.
The authors declare no conflict of interest.
Abbreviations: CSR, class switch recombination; CVID, common variable immune deficiency; IgAD, IgA deficiency; KO, knockout.
cPresent address: Genentech, Inc., 1 DNA Way, South San Francisco, CA 94080.
kTo whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0700815104/DC1.
© 2007 by The National Academy of Sciences of the USA

After appropriate stimulation, B cells undergo class switch recombination (CSR), whereby the functionally rearranged V(D)J DNA segment is recombined with a downstream Ig constant region segment. The biochemistry of CSR is complex and involves the B cell-specific gene activation-induced cytidine deaminase, which initiates both CSR and somatic hypermutation (1). CSR also requires many ubiquitously expressed genes important for detecting DNA mismatches and breaks and regulating DNA repair (2). CSR occurs at specific DNA segments called switch (S) regions, which lie upstream of each constant region and contain hotspots for activation-induced cytidine deaminase-mediated cytosine deamination. The ligation of the Sμ region with the downstream S regions is carried out by protein factors that comprise the nonhomologous end joining machinery for DNA repair (1, 2).

Mismatch repair proteins play a critical role in safeguarding genetic stability. The key proteins for initiation of eukaryotic mismatch repair are homologues of bacterial MutS and MutL. In mammals, there are five MutS (Msh2, Msh3, Msh4, Msh5, and Msh6) and four MutL (Mlh1, Mlh3, Pms1, and Pms2) homologues. Each Mut homologue acts at the DNA repair or recombination site by forming heterodimers: Msh2-Msh6 (MutSα), Msh2-Msh3 (MutSβ), Msh4-Msh5 (MutSγ), Mlh1-Pms2 (MutLα), and Mlh1-Mlh3 (MutLγ) (3). MutS heterodimers are thought to recruit MutL heterodimers. Experiments using Mut homologue-knockout (KO) mice revealed that Msh2-, Msh6-, Mlh1-, and Pms2-deficient animals had decreased efficiency of CSR and somatic hypermutation (4). Deficiencies of MutS and MutL often result in differences in microhomology lengths at S joints and show three phenotypes: decreased (Msh2−/− and Mlh3−/−) (4, 5), no change (Msh6−/−) (6), or increased (Mlh1−/− and Pms2−/−) (5, 7) microhomology. The differences in S joint phenotypes between Msh2−/− mice and Mlh1−/− or Pms2−/− mice suggest the existence of other proteins that function in the same pathway of CSR as Mlh1 and Pms2.

Msh5 and Msh4 are involved in the resolution of DNA Holliday junctions, the four-stranded DNA structures that form during homologous recombination in meiosis (8). Msh4 and Msh5 KO mice are sterile due to an inability to resolve these meiotic chromosomal crossovers (9–11).

Based on these studies in mice, the Mut homologues are attractive candidate genes for human Ig deficiencies. Selective IgA deficiency (IgAD) (serum IgA <0.05 g/liter) is the most common primary immunodeficiency disorder in man, with a prevalence of ≈1/600 Caucasian individuals (12). The selective nature of the CSR defect in IgAD is not understood. Common variable immune deficiency (CVID) is a more severe disease and affects ≈1/25,000 Caucasians. Patients show a marked reduction in serum levels of both IgG (usually <3 g/liter) and IgA (<0.05 g/liter), together with reductions of IgM in about half the cases (<0.3 g/liter). CVID patients have a high incidence of infectious complications and, paradoxically, are prone to autoimmune disorders (13).

The available evidence suggests a common genetic basis for IgAD and CVID (14), and individuals with IgAD may transition into CVID. Haplotypes of the MHC show genetic association with IgAD, notably HLA A1-B8-DR3 and B14-DR1 (15–17).

www.pnas.org/cgi/doi/10.1073/pnas.0700815104 PNAS | April 24, 2007 | vol. 104 | no. 17 | 7193–7198

Fig. 1. Serum IgG3 deficiency, Msh5 gene expression, and CSR in H-2b/b congenic MRL/lpr mice. (A) Map of the 129/Sv congenic interval in F9 and F≥20 congenic H-2b/b MRL/lpr mice. The microsatellite markers and gene polymorphisms used to characterize the introgressed region are shown. (B) Serum IgG3 levels in the F9 H-2k/k, H-2b/k, and H-2b/b MRL/lpr mice. n = 12–16 mice in each group at 12 weeks of age. The number of mice in each group decreased with aging because of mortality. Bars indicate mean values. (C) Msh5 mRNA expression levels were measured in cDNA from splenic B cells of H-2k/k MRL/lpr mice and IgGpos and IgGneg H-2b/b MRL/lpr congenic mice (n = 3 each). (D) CSR of splenic B cells was induced in vitro with LPS for class switch induction to IgG3. Representative FACS plots show the percentage of CD19+ IgG3-positive cells from IgG3pos H-2k/k and IgG3neg H-2b/b MRL/lpr mice. Numbers shown are average percentage ± SEM switched cells for three mice in each group. (E) IgG3pos H-2k/k and IgG3neg H-2b/b MRL/lpr mice were immunized with TNP-LPS or TNP-Ficoll, and IgG2b (Right) and IgG3 (Left) anti-TNP responses were measured at 2 weeks. Serum OD380 values are represented on the y axis. Data shown represent the mean ± SEM; n = 10 in each group. (F) Msh5 expression profile in BALB/c (H-2d) (n = 4), 129/Sv (H-2b) (n = 2), C57BL/6 (H-2b) (n = 3), and FVB (H-2q) (n = 3) mice, using quantitative PCR (mean ± SEM). (C and F) Data represent relative Msh5 mRNA copy numbers when compared with resting B cells from H-2k/k MRL/lpr mice (H-2k/k MRL/lpr = 100%; mean ± SEM). *, P < 0.05; **, P < 0.01; ***, P < 0.001. P values were calculated by using two-tailed Student's t tests.

Homozygosity for the A1-B8-DR3 haplotype is a particularly strong risk factor for IgAD in Caucasians, with an incidence reported as high as 13% (18). Whereas the association of IgAD and CVID with the MHC is clearly documented, the identity of the genetic effect(s) within the MHC remains controversial, with studies suggesting that class II molecules and/or genes in the centromeric class III region are involved (17, 19, 20). Other genes that contribute to CVID include rare mutations in the T cell costimulatory molecule ICOS (21) and TACI (TNFRSF13B) (22, 23).

Here, we provide evidence that Msh5 contributes to dysregulated Ig CSR in mice and identify a possible role for MSH5 in human IgAD and CVID.

Results

H-2b Congenic MRL/lpr Mice Show Defects in CSR. We generated H-2b/b congenic MRL/lpr mice by introgressing the H-2b MHC haplotype from 129/Sv mice onto the MRL/lpr background. After nine generations of backcrossing, animals were genotyped for 136 polymorphic microsatellites, which confirmed that all markers outside the H-2 region were MRL/lpr derived. The congenic H-2b interval measured ≈13 Mb and included the entire MHC region (Fig. 1A). H-2b/b MRL/lpr mice exhibited no differences in disease compared with wild-type animals (24). Strikingly, however, 11/16 (68%) H-2b/b congenics had undetectable serum IgG3 antibodies (Fig. 1B), diminished levels of serum IgA antibodies in older mice, together with elevated serum levels of IgM and IgG2a antibodies. Serum IgG1 and IgG2b levels were similar in H-2b/b and H-2k/k MRL/lpr mice [supporting information (SI) Fig. 5]. The deficiency of IgG3 in the H-2b/b congenics was confirmed by ELISpot assays of splenic antibody secreting cells (SI Fig. 6). Importantly, the antibody phenotypes were similar in congenic H-2b/b MRL/lpr animals backcrossed nine generations, and those animals backcrossed >20 generations (data not shown), demonstrating that the genetic effect is stable, shows consistent incomplete penetrance, and is localized to the H-2 region.

Hypomorphic Allele of Msh5 on the H-2b Haplotype. To identify the gene(s) from the H-2 region contributing to the IgG3 deficiency, we used gene expression microarrays to assay spleen RNA from 8-week-old congenic IgG3pos H-2b/b, IgG3neg H-2b/b, and H-2k/k MRL/lpr littermates. Essentially all of the significant differences in gene expression, with the exception of IgG3 mRNA, were genes encoded within the MHC congenic interval. IgG3 mRNA expression was significantly higher in H-2k/k MRL/lpr mice (average 36,385 Affymetrix expression units) compared to IgG3neg H-2b/b mice (average 2,837 Affymetrix expression units; P = 1 × 10−4) (SI Table 2). The H-2 Ea gene is deleted on the H-2b haplotype (25) and showed low expression in the H-2b/b congenic spleens. Expression differences were also observed for other class I and II MHC genes, which likely reflect polymorphisms between the H-2b and H-2k haplotypes. Msh5, which is located in the MHC class III region (Fig. 1A), was the only other differentially expressed gene within the MHC region and showed ≈6-fold lower expression in the H-2b congenic spleens (IgG3pos H-2b/b, 100 expression units; IgG3neg H-2b/b, 135 expression units) compared with the wild-type H-2k/k mice (668 expression units; P = 9 × 10−3 vs. IgG3pos H-2b/b and P = 8.2 × 10−3 vs. IgG3neg H-2b/b) (SI Table 2). The microarray expression results for Msh5 were confirmed by using TaqMan real-time quantitative PCR (Fig. 1C).

We next evaluated the influence of the low Msh5 levels on B cell

class switching in vitro. Surprisingly, there were no differences in the efficiency of in vitro switching to IgG3 (LPS) or IgG1 (LPS + IL-4) between the IgG3neg H-2b/b congenic and control H-2k/k MRL/lpr B cells (Fig. 1D and data not shown). The ability of H-2b congenic B cells to switch in vitro is reminiscent of human IgAD, where in vitro stimulation of B cells from IgAD patients with CD40 and IL-10 induces normal levels of IgA secretion (26). Interestingly, immunization of IgG3neg H-2b/b MRL/lpr mice with T-independent stimuli TNP-LPS or TNP-Ficoll failed to elicit an IgG3 anti-TNP response, whereas IgM (data not shown) and IgG2b responses were intact (Fig. 1E).

We then surveyed several mouse strains and identified two distinct groups: high Msh5 expressers [MRL/lpr (H-2k), AKR (H-2k) (data not shown) and BALB/c (H-2d)] and low Msh5 expressers [129/Sv (H-2b), C57BL/6 (H-2b) and FVB (H-2q)] (Fig. 1F). The difference in B cell Msh5 mRNA expression levels between the two groups was ≈100-fold. The higher relative levels of Msh5 in the B cells of MRL/lpr H-2b congenics, compared with native H-2b strains, may reflect the activated state of B cells in MRL/lpr mice (27). Consistent with this idea, Msh5 was inducible in B cells from C57BL/6 mice (SI Fig. 7), although induced Msh5 expression levels were at least 10-fold lower than high expressers at baseline. Msh5 expression in B cells from high-expressing strains was not inducible (data not shown). We conclude that H-2b/b congenic MRL/lpr mice express a hypomorphic allele of Msh5 and hypothesize that the low expression of Msh5 on the MRL background contributes to the observed antibody phenotype.

Increased Microhomology at IgG3 Switch Junctions in H-2b/b MRL/lpr B Cells. We next asked whether the IgG3 deficiency observed in the congenic MRL/lpr animals was accompanied by phenotypic differences in IgG3 S joints. Sμ-Sγ3 joints, amplified from splenic B cells of IgG3neg MRL/lpr H-2b/b mice, showed significantly longer segments of microhomology than the S joints from IgG3pos congenic (P = 0.0012) or H-2k/k MRL/lpr B cells (P = 0.0014; Fig. 2). The H-2b/b IgG3neg mice also had longer microhomology segments at Sμ-Sα joints (SI Fig. 8A). No other significant abnormalities were observed at the S junctions (SI Tables 3 and 4).

To extend these findings, we evaluated B cells in Msh5 KO FVB mice (9). Serum antibody levels and in vitro switching showed no significant differences between Msh5−/− mice and littermates (SI Fig. 9A and data not shown). Importantly, the Sμ-Sγ3 (Fig. 2) and Sμ-Sα (SI Fig. 8A) joints from splenic B cells of Msh5−/− mice showed significantly increased microhomology compared with wild-type littermates. To address the concern that other genes in tight linkage disequilibrium with Msh5 might be contributing to the observed phenotype, in both the congenics and the Msh5 KOs, we took two approaches. First, we studied KO mice for complement Factor B (Bf), another MHC class III region gene located <500 Kb centromeric of Msh5. Gene targeting for Bf was performed on the 129/Sv background, and the mice studied were backcrossed seven generations onto the C57BL/6 genetic background (28). Importantly, Sμ-Sγ3 and Sμ-Sα joints from splenic B cells of C57BL/6 wild-type and Bf−/− C57BL/6 mice showed no significant differences in the lengths of microhomologies (SI Fig. 8 B and C). Second, we evaluated mice deficient for Msh4, the heterodimeric partner of Msh5. Msh4 is located on mouse chromosome 3, a location distinct from Msh5. Similar to Msh5−/− mice, there were no significant differences in serum antibody levels (SI Fig. 9B) or in vitro class switching (SI Fig. 10). However, splenic Sμ-Sγ3 and Sμ-Sα joints from Msh4−/− mice showed significantly longer microhomologies compared with wild-type littermates (Fig. 2 and SI Fig. 8A). Taken together, these data strongly support a role for Msh5 and Msh4 in the regulation of microhomology at Ig S joints in mice.

Fig. 2. Increased microhomology at Sμ-Sγ3 junctions in IgG3neg H-2b/b congenic MRL/lpr, Msh5−/− FVB, and Msh4−/− C57BL/6 mice. S joints were amplified from three 6- to 8-week-old mice in the H-2k/k MRL/lpr, IgG3pos H-2b/b MRL/lpr, wild-type FVB, Msh5−/− FVB, and wild-type C57BL/6 groups, and four 6- to 8-week-old mice in the IgG3neg H-2b/b MRL/lpr and Msh4−/− C57BL/6 groups. Each dot represents the number of nucleotides of donor/acceptor identity at the junction for an individual S joint. P values were calculated by using two-tailed Mann–Whitney tests.

Genetic Variants of MSH5 Are Associated with Human CVID and IgAD. The selective antibody isotype deficiency observed in the H-2b/b congenic MRL/lpr mice prompted us to investigate MSH5 as a candidate gene for human IgAD and CVID. MSH5 mRNA is expressed in human tonsillar B lymphocytes and is present at levels that are higher in CD77+ germinal center B cells than in naïve or memory B cells (29) (SI Fig. 11). Using quantitative PCR, we confirmed the constitutive expression of MSH5 mRNA in both purified peripheral blood human B cells and Epstein–Barr virus transformed B cells (SI Table 5).

To identify genetic variation in MSH5, we sequenced the 25 coding and noncoding exons of MSH5 together with the promoter region in 96 IgAD and CVID cases and identified five nonsynonymous polymorphisms and a number of SNPs in noncoding regions of the gene (SI Fig. 12 and data not shown). To determine whether the identified variation in MSH5 contributed genetic susceptibility to IgAD or CVID, we genotyped relevant SNPs in 207 Swedish IgAD cases, 83 Swedish CVID cases, and 198 Swedish control cases and compared allele frequencies (Table 1).

We identified two rare nonsynonymous SNPs: a G/T SNP, Q292H, in exon 11, which was present in one Swedish CVID case

Table 1. Association of MSH5 alleles with CVID and IgAD in Sweden and the U.S.

                                          MSH5 alleles
Cohort                                    L85F/P786S           rs3131378
                                          (exons 3, 24)        (intron 12)

Allele frequency (n)*
Sweden
  Controls (N = 396)†                     1.8% (7)             11.9% (47)
  IgAD (N = 414)                          3.6% (15)            31.4% (128)
                                          P = 0.104‡           P = 2.1 × 10−11
  CVID (N = 166)                          2.4% (4)             15.7% (26)
                                                               P = 0.22
U.S.
  Controls (N = 976)                      3.2% (31)            9.7% (95)
  IgAD (N = 6)                            50% (3)              0 (0)
  CVID (N = 204)                          5.4% (11)            13.2% (27)
                                          P = 0.12             P = 0.135

Pooled odds ratio (confidence interval, 95%)§
Combined
  All IgAD                                2.85 (1.24–6.51)     3.28 (2.28–4.72)
  (N = 420 cases, 1,372 controls)         P = 5.8 × 10−3       P = 7.9 × 10−11
  All CVID                                1.63 (0.88–3.02)     1.40 (1.00–1.98)
  (N = 370 cases, 1,372 controls)         P = 0.058            P = 0.026
  All IgAD and CVID                       2.04 (1.29–3.30)     2.15 (1.69–2.73)
  (N = 790 cases, 1,372 controls)         P = 1.8 × 10−3       P = 2.6 × 10−10

*N = total number of chromosomes genotyped in each group.
†n = number of positive alleles.
‡P values determined by using χ2 tests.
§Pooled odds ratios and P values determined by using Mantel-Haenszel tests.
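The per-cohort P values and pooled odds ratios reported in Table 1 for the L85F/P786S allele can be approximated directly from the allele counts in that table. The Python sketch below is illustrative only. It assumes a Pearson χ2 test without continuity correction for the single-cohort comparison and the standard Mantel-Haenszel estimator for the pooled odds ratio; the text names the tests but does not give these details, and all function and variable names are hypothetical.

```python
import math

# Allele counts for the MSH5 L85F/P786S allele, copied from Table 1:
# (positive alleles n, total alleles genotyped N) per group and cohort.
COHORTS = {
    "Sweden": {"controls": (7, 396), "IgAD": (15, 414), "CVID": (4, 166)},
    "U.S.": {"controls": (31, 976), "IgAD": (3, 6), "CVID": (11, 204)},
}


def two_by_two(cases, controls):
    """Return (a, b, c, d): positive/negative allele counts in cases, then controls."""
    (pos_case, n_case), (pos_ctrl, n_ctrl) = cases, controls
    return pos_case, n_case - pos_case, pos_ctrl, n_ctrl - pos_ctrl


def chi2_p(a, b, c, d):
    """P value of a Pearson chi-square test (1 df, no continuity correction)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def mantel_haenszel_or(disease):
    """Mantel-Haenszel pooled odds ratio across the Swedish and U.S. cohorts."""
    numerator = denominator = 0.0
    for cohort in COHORTS.values():
        a, b, c, d = two_by_two(cohort[disease], cohort["controls"])
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


if __name__ == "__main__":
    sweden = COHORTS["Sweden"]
    print("Sweden IgAD vs. controls, P =",
          round(chi2_p(*two_by_two(sweden["IgAD"], sweden["controls"])), 3))
    for disease in ("IgAD", "CVID"):
        print(disease, "pooled OR =", round(mantel_haenszel_or(disease), 2))
```

Under these assumptions the sketch gives P ≈ 0.104 for the Swedish IgAD comparison and pooled odds ratios of about 2.85 (IgAD) and 1.63 (CVID), in line with the corresponding entries in Table 1. The confidence intervals and pooled P values are not reproduced here.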

and absent in all controls tested; and a G/T SNP, C580G, in exon 19, which was present in 2 of 212 IgAD patients and not found in either controls or CVID cases (SI Table 6). We also identified SNPs in exon 3 (L85F, rs28381349) and exon 24 (P786S, rs28399984). Interestingly, the L85F SNP was always found together with the P786S SNP (D′ = 1), indicating they are located on the same chromosomal segment. By oligotyping HLA-B and DR alleles in the Swedish cases, we determined that the L85F/P786S allele is present on the ancestral HLA B14-DR1 haplotype. Fourteen of 16 L85F/P786S cases (88%) were DR1 and/or B14 positive (SI Fig. 12). Importantly, B14-DR1 is one of the MHC haplotypes that has shown strong genetic association with IgAD and CVID (16). The MSH5 L85F/P786S allele was present in the Swedish controls at a frequency of 1.8% and was enriched ≈2-fold in IgAD cases (3.6%) and to a lesser extent in CVID (2.4%). These differences did not reach statistical significance (Table 1). Another nonsynonymous SNP, P29S, in exon 2 (rs2075789) was a frequent polymorphism in the control population (12.1%) and was not enriched in patients (SI Table 6).

SNP rs3131378, located in intron 12 of MSH5, is present on the extended A1-B8-DR3 MHC haplotype. Of 124 haplotypes carrying the SNP rs3131378, 111 were positive for DR3 (90%), and 107/127 were positive for B8 (84%). rs3131378 was strongly associated with IgAD (31.4% allele frequency compared with 11.9% in controls, P = 2.1 × 10−11) (Table 1). CVID patients showed only a modestly increased allele frequency of rs3131378 (15.7%).

We next typed these SNPs in an independent cohort of 102 United States Caucasian CVID cases and 488 U.S. controls. Three U.S. IgAD cases were also typed. Although not reaching statistical significance, the L85F/P786S double missense allele of MSH5 was more frequent in the CVID cohort (5.4%) than in controls (3.2%). All three U.S. IgAD cases typed were heterozygous for the L85F/P786S allele. The MSH5 allele that was present on the extended DR3 haplotype was also modestly increased in U.S. CVID cases (13.2% vs. 9.7% in controls) (Table 1).

In a combined analysis of the Swedish and U.S. cohorts, the L85F/P786S allele showed significant association with IgAD (P = 5.8 × 10−3), borderline significance in CVID (P = 0.058), and evidence for association in the combined IgAD+CVID analysis (P = 1.8 × 10−3). rs3131378 showed strong association in the combined IgAD analysis (P = 7.9 × 10−11) and significant evidence in the pooled CVID analysis (P = 0.026) (Table 1).

We next performed yeast two-hybrid assays to measure the interaction of the L85F/P786S variant MSH5 protein with MSH4, because both L85F and P786S are located within identified MSH4-interacting domains of MSH5 (SI Fig. 12) (30). Using sensitive liquid β-galactosidase assays, we found that the L85F/P786S MSH5 protein variant showed a diminished ability to bind to MSH4 as compared with the wild-type MSH5 (Fig. 3).

Fig. 3. MSH5 L85F/P786S variant has reduced binding affinity to MSH4. Yeast two-hybrid assays were performed to assess the ability of the wild-type MSH5 and MSH5 L85F/P786S variant proteins, fused with either a LexA DNA binding domain (BD) or a Gal4 activation domain (AD) to interact with wild-type MSH4. The strength of interaction was measured by using a liquid β-galactosidase assay. Data represent mean ± SE of nine replicates from three independent experiments for the AD-MSH5 L85F/P786S interaction with BD-MSH4, and 12 replicates for all other conditions in four independent experiments. Western blots of whole yeast lysates confirmed equivalent expression of wild-type and MSH5 L85F/P786S proteins (SI Fig. 17).

Given the background frequency of the MSH5 L85F/P786S allele in the population, it was important to determine whether controls carrying this allele were IgA deficient. We measured IgA levels in the plasma of 11 controls heterozygous for the L85F/P786S allele. All showed normal IgA levels (ranging from 1.0 to 4.4 g/liter), suggesting there is incomplete penetrance for the Ig deficiency phenotype associated with the L85F/P786S allele, consistent with the hypothesis that IgAD is a complex multigenic disease.

Increased Microhomology and Lower Mutation Rate at Sμ-Sα1 Joints of CVID Patients Carrying Disease Associated MSH5 Alleles. We next investigated whether patients with IgAD and CVID carrying the identified MSH5 alleles showed differences in Ig S joint phenotypes. Sμ-Sα1 joints were amplified from peripheral blood DNA in three groups of controls: healthy donors lacking MSH5 or TACI polymorphisms (Control 1), healthy donors homozygous for rs3131378 on the B8-DR3 extended haplotype (Control 2, DR3+/+), and healthy donors heterozygous for the MSH5 L85F/P786S allele (Control 3, L85F/P786S). These sequences were compared with those amplified from three patient groups: CVID patients carrying TACI mutations and lacking MSH5 nonsynonymous or DR3 alleles (Patient 1, TACI*), DR3+/+ patients (Patient 2, DR3+/+), and patients carrying nonsynonymous alleles of MSH5 (Patient 3, MSH5*). The CVID group carrying TACI mutations tested the hypothesis that S joint phenotypes would differ between patients carrying TACI and MSH5 alleles.

Strikingly, CVID patients carrying MSH5 nonsynonymous polymorphisms displayed significantly longer stretches of Sμ-Sα1 microhomology than the various control groups (median 9 bp vs. 2 bp in controls; P = 1.9 × 10−7) (Fig. 4A). DR3+/+ CVID patients demonstrated a similar long microhomology phenotype (median 8 bp, P = 3 × 10−4). Importantly, S junction microhomology in CVID patients carrying alleles of the TACI gene and lacking any of the disease associated MSH5 alleles showed levels of microhomology similar to controls (Fig. 4A and SI Table 7). Thus, the long microhomology phenotype was specific to CVID cases carrying disease-associated MSH5 alleles.

We also found differences in S joint mutation rates between the groups. In controls lacking disease-associated MSH5 alleles, the mutation rate across the entire S joints averaged five mutations per 1,000 bp (SI Table 8). In contrast, there were far fewer mutations in joints from patients carrying MSH5 nonsynonymous polymorphisms (1.3 mutations per 1,000 bp; P = 2 × 10−12) or DR3+/+ (0.3 mutations per 1,000 bp; P = 1.3 × 10−7). Interestingly, mutation rates were also significantly lower in the DR3+/+ controls (3.2 mutations per 1,000 bp, P = 0.01) compared with other control groups. Across all of the groups studied, ≈90% of the S region mutations were targeted to dG:dC base pairs, but no differences were observed for the rate of transitions at dG:dC base pairs (SI Fig. 13 and SI Table 8). We also analyzed Sμ-Sα joints from IgAD patients carrying the associated MSH5 alleles, and the results mirrored those observed in CVID (SI Table 9).

To investigate whether mutations in MSH5 could also be contributing to alterations in the CSR process to other Ig isotypes, we characterized Sμ-Sγ3 junctions from a group of CVID patients carrying MSH5 disease-associated alleles. No significant differences in the length of microhomology were observed among the different groups (SI Table 10). Of interest, and similar to the findings in Sμ-Sα1 joints, the mutation rate across Sμ-Sγ3 joints was lower for both MSH5 L85F/P786S and DR3+/+ patients compared with controls (SI Table 11).

Finally, we examined the targeting of breakpoints to either pentamers or activation-induced cytidine deaminase hotspots and

the pattern of alignment of pentamer repeats at the S junctions. In controls, ≈50% of the Sμ and Sα1 breakpoints occurred within pentamer repeats. In contrast, breakpoints from MSH5 L85F/P786S patients were significantly targeted to pentamer motifs at Sμ (82%, P = 5.4 × 10−3) (SI Table 7) but not at Sα1 (SI Table 11). Controls carrying the MSH5 L85F/P786S allele showed a similar preferential targeting of Sμ breakpoints to pentamers (75%, P = 0.013) as observed in L85F/P786S patients. More significantly, we found that the vast majority of Sμ-Sα1 junctions from MSH5 L85F/P786S (95%) and DR3+/+ (100%) patients showed an “in-phase” alignment of pentamer motifs, whereas in control junctions, only ≈50% of pentamer motifs were aligned (Fig. 4B and SI Fig. 14).

Fig. 4. Extended microhomology at B cell Ig switch joints of CVID patients carrying associated alleles of MSH5. (A) Distribution of microhomology length in Sμ-Sα1 junctions from CVID patients and controls. Each dot represents the length of microhomology of an independent S joint. The unlabeled group of controls lack any MSH5 nonsynonymous or DR3 alleles. MSH5 (L85F/P786S), heterozygote for the L85F/P786S allele. DR3+/+, homozygous for the rs3131378 SNP on the extended B8-DR3 MHC haplotype. TACI*, carrying one or more TACI missense mutations. Red, 0–1 bp microhomology; blue, 2–7 bp microhomology; orange, ≥8 bp microhomology. P values were calculated by using Mann–Whitney tests. (B) The percentage of S joints where there was “in-frame” alignment of pentamer repeat units between germline Sμ and Sα1 is represented. n = number of joints sequenced and examined. Statistical significance was determined by using two-tailed Fisher’s exact tests.

Discussion

We found that ≈70% of H-2b/b congenic MRL/lpr mice carrying a hypomorphic Msh5 allele were deficient in serum IgG3, and nucleotide sequence analysis of Ig S junctions revealed increased donor/acceptor microhomology. Similar long microhomologies at S junctions were observed in KO mice for Msh5 and Msh4. Furthermore, we identified several alleles of MSH5 in humans that show genetic association with CVID and IgAD, including one (L85F/P786S) where the encoded mutant protein showed reduced binding affinity to its heterodimerization partner MSH4. Similar to the phenotype observed in the mice, Sμ-Sα1 joints from patients with associated MSH5 alleles showed increased microhomology, together with a reduced mutation rate, an increased in-phase alignment of pentamer repeats at the junctions, and targeting of Sμ breaks to pentamers (SI Fig. 15).

Given these findings, what might be the mechanism by which Msh5 participates in the complex biochemistry of CSR? Msh5 has a well-characterized role in resolving Holliday junctions that form between homologous DNA strands during meiosis (8–10). Heterodimers of Msh5 and Msh4 are postulated to form a “sliding clamp” on DNA and serve as scaffolding for the recombination machinery including the DNA repair proteins Mlh1 and Pms2 (8). We envision a similar function for Msh4/5 during the early stages of intra-chromosomal “synapsis” of Sμ to Sγ3 or Sα, thereby facilitating the recruitment of proteins required for nonhomologous end joining. An important observation relevant to the current data is the higher level of homology between Sμ and Sγ3 regions than between Sμ and any of the other S regions in mice (1). Similarly, in humans the Sμ region shows high levels of overall homology with the Sα region (≈70%) and much less homology (≈20%) with the various IgG S regions (31). We speculate that, given its potential contribution to IgAD, MSH5 in humans may have a specific role in facilitating CSR between Sμ and Sα. There is more IgA produced in the body than any other Ig isotype and this function of MSH5 may have evolved to ensure high-level IgA production for mucosal defenses.

Another important question is the relationship between Msh5 and the long S joint microhomologies. S junctions with extended microhomology are rarely found in the B cells of healthy individuals and may reflect the activity of an alternative microhomology-mediated end-joining (MMEJ) pathway, which uses homology searching and exonuclease activity to ligate homologous DNA strands with 3′ or 5′ overhangs. Increased S joint microhomology is also found in humans with ATM and DNA Ligase IV missense mutations (32, 33) and in Pms2−/− (7) and Mlh1−/− (5) mice, suggesting that these proteins may form a complementation group for CSR. If this group of genes is important for recruitment of the nonhomologous end joining machinery, reduced function of these proteins may result in a net loss in efficiency of CSR, as observed in IgAD and CVID, and an increased dependence on microhomology-directed mechanisms for alignment and ligation of S joints. The low mutation rate noted in long microhomology S joints may reflect exonuclease activity that is required for MMEJ, which, we postulate, could “erase” the footprints of activation-induced cytidine deaminase activity at the initial double-strand break, such that the ligated ends of the resulting S regions are well upstream or downstream from the initial site of double-strand break and in an area of reduced mutation frequency. We are also intrigued by the possibility that Msh5 may have anti-recombinational activity and secondarily function to suppress MMEJ of S joints (SI Fig. 16).

Many oncogenic chromosome translocations in the Ig S region contain short stretches of microhomology between the donor and acceptor DNA strands (34, 35). Thus, active suppression of the MMEJ pathway may reduce the number of chromosomal translocations resulting from CSR. Interestingly, the frequency of malignant B cell lymphomas is increased in patients with CVID (13).

Antibody deficiencies were observed in congenic H-2b/b MRL/lpr mice, but not in Msh4−/− or Msh5−/− mice, whereas long S joint microhomologies were found in each strain. MRL/lpr mice have strong spontaneous self-antigen driven B cell responses in vivo, with antibody levels up to 10-fold higher than non-autoimmune animals (27). We speculate that the genetically based activated B cell phenotype in MRL/lpr mice accentuates the defects in CSR caused by the hypomorphic allele of Msh5 in the congenics. It seems likely that the IgG3 deficiency in MRL/lpr H-2b/b mice reflects specific

Sekine et al. PNAS ͉ April 24, 2007 ͉ vol. 104 ͉ no. 17 ͉ 7197 Downloaded by guest on September 28, 2021 interactions between the hypomorphic H-2b Msh5 allele and other Methods genes in the MRL/lpr background. Mice. 129/Sv (H-2b), C57BL/6 (H-2b), BALB/c (H-2d), FVB (H-2q) As for the human studies, the nonsynonymous alleles of MSH5 and MRL/lpr (H-2k) mice were purchased from The Jackson identified in the current study are rare (Q292H and C580G) or Laboratory (Bar Harbor, ME). H-2b/b MRL/lpr mice were gener- uncommon (L85S/P786S), and thus our power to definitively ated as described in ref. 24. Msh5-deficient mice were described in conclude that these alleles contribute to immune deficiency based ref. 9 and were bred onto the FVB (H-2q) background for Ͼ15 on genetic data is limited. We believe it is important to interpret the generations. Bf-gene KO mice were provided by J. Thurman human genetic data under the hypothesis that IgAD and CVID are (University of Colorado, Boulder, CO) (28), and Msh4-gene KO complex genetic diseases. As shown here, Ig deficiencies were not mice, backcrossed to C57BL/6, were provided by W. Edelmann observed in controls heterozygous for MSH5 nonsynonymous (Albert Einstein College of Medicine, Bronx, NY) (11). alleles. However, there were subtle, yet significant, changes in S joint phenotypes in controls carrying the various MSH5 alleles: Human DNA Samples. DNA from 207 IgAD and 83 CVID Swedish decreased switch joint mutation rates in DR3ϩϩ controls and cases and 198 controls were collected at the Karolinska Institute increased targeting of S␮ breakpoints to pentamers in MSH5 (Stockholm, Sweden). Three IgAD and 102 CVID U.S. cases were L85F/P786S controls (SI Fig. 15 and SI Table 8). Furthermore, collected at the Mt. Sinai Hospital (New York, NY) and at the B8-DR3 is a common MHC haplotype in the Caucasian population NCI/NIH (Bethesda, MD). 488 healthy U.S. 
controls were selected (Ϸ10% allele frequency), yet the vast majority of DR3ϩϩ indi- from the New York Health Project (37). All U.S. patients and viduals are not immune deficient. We have not yet identified controls are of self-reported European-Caucasian ancestry. In- functional effects of the MSH5 allele carried on the A1-B8-DR3 formed consent was obtained from all subjects, and the studies were extended haplotype. Although there are seven unique SNPs within approved by human subjects research institutional review boards. the MSH5 gene found only on this extended haplotype, there are no nonsynonymous polymorphisms, and further work will be required MSH5 Sequencing and Genotyping. The 25 exons and 1 Kb of the to determine whether this allele is associated with altered splicing, promoter of MSH5 were sequenced in 63 CVID cases from the U.S. expression, folding or inducibility of MSH5 during CSR. Because of or Sweden, and 33 IgAD cases from Sweden at the Broad Institute. ࿝ the high level of linkage disequilibrium on the B8-DR3 haplotype, Automated sequence analysis software (SNP COMPARE) and it is currently not possible to rule out the potential role of additional manual examination was used to screen the sequencing files. genes on the haplotype contributing to CSR. Twenty-seven ‘‘high-quality’’ SNPs were identified, of which, 17 In summary, these data provide evidence that the contribution of were already described in dbSNP (v124), and 10 SNPs were new. Msh5 to Ig CSR is complex and likely regulatory. Antibody Additional sequencing was performed to resolve discrepancies or to fill-in missing data. Genotyping assays for the identified SNPs were deficiency was only observed in the congenic MRL/lpr mice, and performed by using both Taqman and Sequenom platforms. Primer inbred strains carrying hypomorphic alleles of Msh5 (e.g., C57BL/6, and probe sequences are shown in SI Table 12. 
Additional details 129/Sv) did not show Ig deficiencies, possibly due to balancing are provided in SI Materials and Methods. selection of the MHC region (36). In humans, the various mutant Mouse and human switch junction sequence alignments are MSH5 alleles identified may not be sufficient by themselves to cause also provided in SI Appendices 1 and 2. clinically significant antibody deficiencies and may be compensated by other genes that confer disease resistance. Thus, the disease state We thank the many patients and physicians for their contributions. These in IgAD and CVID is likely a result of complex interactions with studies were supported by Fundac¸a˜o para a Cieˆncia e Tecnologia, Portugal other susceptibility genes and possibly environmental factors, sim- Fellowship SFRH/BD/16281/2004 (to R.C.F.), National Institutes of Health ilar to other multigenic complex diseases in humans. Grants U19 AI067152 and AR043274, and the Swedish Research Council.

1. Min IM, Selsing E (2005) Adv Immunol 87:297–328.
2. Xu Y (2006) Nat Rev Immunol 6:261–270.
3. Svetlanov A, Cohen PE (2004) Exp Cell Res 296:71–79.
4. Wu X, Tsai CY, Patam MB, Zan H, Chen JP, Lipkin SM, Casali P (2006) J Immunol 176:5426–5437.
5. Schrader CE, Vardo J, Stavnezer J (2002) J Exp Med 195:367–373.
6. Li Z, Scherer SJ, Ronai D, Iglesias-Ussel MD, Peled JU, Bardwell PD, Zhuang M, Lee K, Martin A, Edelmann W, Scharff MD (2004) J Exp Med 200:47–59.
7. Ehrenstein MR, Rada C, Jones AM, Milstein C, Neuberger MS (2001) Proc Natl Acad Sci USA 98:14553–14558.
8. Snowden T, Acharya S, Butz C, Berardini M, Fishel R (2004) Mol Cell 15:437–451.
9. de Vries SS, Baart EB, Dekker M, Siezen A, de Rooij DG, de Boer P, te Riele H (1999) Genes Dev 13:523–531.
10. Edelmann W, Cohen PE, Kneitz B, Winand N, Lia M, Heyer J, Kolodner R, Pollard JW, Kucherlapati R (1999) Nat Genet 21:123–127.
11. Kneitz B, Cohen PE, Avdievich E, Zhu L, Kane MF, Hou H, Jr, Kolodner RD, Kucherlapati R, Pollard JW, Edelmann W (2000) Genes Dev 14:1085–1097.
12. Burrows PD, Cooper MD (1997) Adv Immunol 65:245–276.
13. Cunningham-Rundles C, Bodian C (1999) Clin Immunol 92:34–48.
14. Vorechovsky I, Zetterquist H, Paganelli R, Koskinen S, Webster AD, Bjorkander J, Smith CI, Hammarstrom L (1995) Clin Immunol Immunopathol 77:185–192.
15. Hammarstrom L, Smith CI (1983) Tissue Antigens 21:75–79.
16. Olerup O, Smith CI, Hammarstrom L (1990) Nature 347:289–290.
17. Schaffer FM, Palermos J, Zhu ZB, Barger BO, Cooper MD, Volanakis JE (1989) Proc Natl Acad Sci USA 86:8015–8019.
18. Alper CA, Marcus-Bagley D, Awdeh Z, Kruskall MS, Eisenbarth GS, Brink SJ, Katz AJ, Stein R, Bing DH, Yunis EJ, Schur PH (2000) Tissue Antigens 56:207–216.
19. Vorechovsky I, Webster AD, Plebani A, Hammarstrom L (1999) Am J Hum Genet 64:1096–1109.
20. Vorechovsky I, Cullen M, Carrington M, Hammarstrom L, Webster AD (2000) J Immunol 164:4408–4416.
21. Grimbacher B, Hutloff A, Schlesier M, Glocker E, Warnatz K, Drager R, Eibel H, Fischer B, Schaffer AA, Mages HW, et al. (2003) Nat Immunol 4:261–268.
22. Salzer U, Chapel HM, Webster AD, Pan-Hammarstrom Q, Schmitt-Graeff A, Schlesier M, Peter HH, Rockstroh JK, Schneider P, Schaffer AA, et al. (2005) Nat Genet 37:820–828.
23. Castigli E, Wilson SA, Garibyan L, Rachid R, Bonilla F, Schneider L, Geha RS (2005) Nat Genet 37:829–834.
24. Sekine H, Graham KL, Zhao S, Elliott MK, Ruiz P, Utz PJ, Gilkeson GS (2006) J Immunol 15:7423–7434.
25. Mathis DJ, Benoist C, Williams VE, 2nd, Kanter M, McDevitt HO (1983) Proc Natl Acad Sci USA 80:273–277.
26. Briere F, Bridon JM, Chevet D, Souillet G, Bienvenu F, Guret C, Martinez-Valdez H, Banchereau J (1994) J Clin Invest 94:97–104.
27. Theofilopoulos AN, Dixon FJ (1985) Adv Immunol 37:269–390.
28. Taube C, Thurman JM, Takeda K, Joetham A, Miyahara N, Carroll MC, Dakhama A, Giclas PC, Holers VM, Gelfand EW (2006) Proc Natl Acad Sci USA 103:8084–8089.
29. Klein U, Tu Y, Stolovitzky GA, Keller JL, Haddad J, Jr, Miljkovic V, Cattoretti G, Califano A, Dalla-Favera R (2003) Proc Natl Acad Sci USA 100:2639–2644.
30. Yi W, Wu X, Lee TH, Doggett NA, Her C (2005) Biochem Biophys Res Commun 332:524–532.
31. Dunnick W, Hertz GZ, Scappino L, Gritzmacher C (1993) Nucleic Acids Res 21:365–372.
32. Pan Q, Petit-Frere C, Lahdesmaki A, Gregorek H, Chrzanowska KH, Hammarstrom L (2002) Eur J Immunol 32:1300–1308.
33. Pan-Hammarstrom Q, Jones AM, Lahdesmaki A, Zhou W, Gatti RA, Hammarstrom L, Gennery AR, Ehrenstein MR (2005) J Exp Med 201:189–194.
34. Ramiro AR, Jankovic M, Eisenreich T, Difilippantonio S, Chen-Kiang S, Muramatsu M, Honjo T, Nussenzweig A, Nussenzweig MC (2004) Cell 118:431–438.
35. Weinstock DM, Elliott B, Jasin M (2006) Blood 107:777–780.
36. Hughes AL, Yeager M (1998) Annu Rev Genet 32:415–435.
37. Mitchell MK, Gregersen PK, Johnson S, Parsons R, Vlahov D (2004) J Urban Health 81:301–310.

7198 | www.pnas.org/cgi/doi/10.1073/pnas.0700815104 | Sekine et al.