discovery through in vitro directed evolution of consensus recognition epitopes

John T. Ballew (a,b), Joseph A. Murray (c), Pekka Collin (d), Markku Mäki (e,f), Martin F. Kagnoff (g,h), Katri Kaukinen (d,e,i), and Patrick S. Daugherty (a,b,1)

(a) Department of Chemical Engineering and (b) Center for Bioengineering, Biomolecular Science and Engineering Program, University of California, Santa Barbara, CA 93106; (c) Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN 55905; (d) Department of Gastroenterology and Alimentary Tract Surgery, Tampere University Hospital, FIN-33520, Tampere, Finland; (e) School of Medicine, University of Tampere, FIN-33520, Tampere, Finland; (f) Tampere Center for Child Health Research, University of Tampere and Tampere University Hospital, FIN-33520, Tampere, Finland; (g) Laboratory of Mucosal Immunology, Department of Medicine and (h) Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093; and (i) Department of Medicine, Seinäjoki Central Hospital, FIN-60220, Seinäjoki, Finland

Edited by K. Christopher Garcia, Stanford University, Stanford, CA, and approved October 17, 2013 (received for review August 5, 2013)

To enable discovery of serum antibodies indicative of disease and simultaneously develop reagents suitable for diagnosis, in vitro directed evolution was applied to identify consensus peptides recognized by patients' serum antibodies. Bacterial cell-displayed peptide libraries were quantitatively screened for binders to serum antibodies from patients with celiac disease (CD), using cell-sorting instrumentation to identify two distinct consensus epitope families specific to CD patients (PEQ and [E/D]xFV[Y/F]Q). Evolution of the [E/D]xFV[Y/F]Q consensus epitope identified a celiac-specific epitope, distinct from the two CD hallmark antigens tissue transglutaminase-2 and deamidated gliadin, exhibiting 71% sensitivity and 99% specificity (n = 231). Expansion of the first-generation PEQ consensus epitope via in vitro evolution yielded octapeptides QPEQAFPE and PFPEQxFP that identified ω- and γ-gliadins, and their deamidated forms, as immunodominant B-cell epitopes in wheat and related cereal proteins. The evolved octapeptides, but not first-generation peptides, discriminated one-way blinded CD and non-CD sera (n = 78) with exceptional accuracy, yielding 100% sensitivity and 98% specificity. Because this method, termed antibody diagnostics via evolution of peptides, does not require prior knowledge of pathobiology, it may be broadly useful for de novo discovery of antibody biomarkers and reagents for their detection.

Significance

The diagnosis of many diseases is dependent upon accurate detection of particular antibodies present in blood. However, the development of biochemical reagents that can reliably detect these antibodies has proved remarkably challenging. This study describes a process to create biochemical reagents that can accurately and reliably detect disease-associated antibodies, without requiring knowledge of the cause or mechanisms of disease. Simultaneously, this process enabled identification of a critical environmental agent involved in celiac disease. Thus, the process presented here may enable the development of effective diagnostic tests for other medical conditions where such tests are lacking and the identification of environmental factors involved in disease.

Author contributions: J.T.B., J.A.M., M.F.K., and P.S.D. designed research; J.T.B. performed research; J.A.M., P.C., M.M., and K.K. contributed new reagents/analytic tools; J.T.B. and P.S.D. analyzed data; and J.T.B., M.F.K., and P.S.D. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
1 To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1314792110/-/DCSupplemental.
19330–19335 | PNAS | November 26, 2013 | vol. 110 | no. 48 | www.pnas.org/cgi/doi/10.1073/pnas.1314792110

The diagnosis of many diseases relies heavily upon the accuracy of antibody detection. Assays to detect antibodies using known antigens are used extensively to diagnose infectious and autoimmune diseases. Moreover, antibodies exhibiting unique antigen-binding patterns have been shown to occur in diverse human diseases, including oncological (1), inflammatory (2), and neurological and psychiatric disorders (3). The utility of antibodies in diagnostics derives from their intrinsic affinity and specificity, biochemical stability, and abundance in blood. Nevertheless, the identification of rare antibody specificities indicative of disease and the development of reagents for their accurate detection have proved exceptionally difficult (4). Intersubject variability of antibody specificities is a major challenge to the development of accurate tests. Specifically, individual genetic and stochastic variations that shape the antibody repertoire introduce heterogeneity in disease antibody subpopulations (polyclonal variation, specificity, affinity, and titer) that hinders uniform antibody detection (5, 6).

Random peptide libraries (RPLs) have been proposed as a potential source of diagnostic reagents capable of mimicking diverse biological antigens in the environment (7–9). Individual peptides identified from RPLs using patient sera have been capable of identifying patients with disease with modest accuracy (9, 10). Diagnostic accuracy can be improved in some cases, using panels of library-isolated peptides coupled with statistical classification algorithms (11), with the drawback of requiring multiple independent measurements. Despite these advances, peptides identified from random libraries have exhibited insufficient diagnostic efficacy (sensitivity and specificity) to foster their clinical development (11–13). Although approved antibody-based diagnostic assays often exhibit sensitivity and/or specificity values in excess of 95% (14, 15), library-isolated peptides that mimic antigens (mimotopes), used alone or in combination, rarely meet these stringent requirements. For example, peptides from RPLs selected against serum antibodies from patients with Crohn's disease (16), multiple sclerosis (12, 17, 18), celiac disease (11, 13), rheumatoid arthritis (19), or type-1 diabetes (20–22) have exhibited insufficient diagnostic accuracy. Although these studies have provided support for continued investigation of antibodies as candidate biomarkers, they have not yielded clinically efficacious diagnostic reagents. Consequently, there remains a need for discovery processes to produce antibody detection reagents exhibiting accuracies desired for clinical development.

Although antibody profiling methods using RPLs, including phage and bacterial display, lend themselves to various in vitro directed evolution protocols, this capability has not been exploited using blood specimens from patients. Given this, we applied bacterial display peptide libraries to first screen for disease-specific antibody binding peptides and subsequently to evolve peptides to achieve diagnostically useful levels of sensitivity and specificity. We selected celiac disease (CD) as a model disease because two distinct antibody specificities, transglutaminase 2 (TG2) and deamidated gliadin, have been characterized extensively (23) and serve as clinically important antibody biomarkers. Our results demonstrate that in vitro directed evolution can be applied for de novo generation of reagents that exhibit requisite levels of diagnostic sensitivity and specificity for clinical translation. Finally, our results raise the intriguing possibility that in vitro evolution of such diagnostic reagents may provide a route to identify previously unknown environmental antigens involved in disease and thereby elucidate pathobiology mechanisms.

Results

Discovery of Celiac Disease-Specific Peptide Epitopes. Bacterial display random peptide libraries of the form X15, X12CX3, and X4CX7CX4 were screened using fluorescence-activated cell sorting (FACS). For screening, individual patient sera were pooled into three groups of CD cases and three groups of non-CD sera [i.e., healthy and gastrointestinal (GI)-illness control subjects], with each group composed of sera pooled from eight subjects. Alternating rounds of library enrichment were performed with CD sera using FACS and subtraction with non-CD sera using magnetic cell sorting (MACS) (Fig. 1). To determine whether enriched library members were specific for sera from CD groups and thereby guide screening, flow cytometry was applied to quantitatively measure reactivity levels after each cycle of sorting (Fig. S1). Libraries were sorted independently based on isotype-specific reactivity, using anti-IgG, anti-IgA, and anti-IgM secondary reporters. Alternating cycles of enrichment/subtraction resulted in large reactivity differences between pooled CD and non-CD sera for IgA and IgG, but not IgM, binding peptides (Fig. S2 A and B). Peptide sequences from IgG and IgA isotype-specific library screening revealed two prevalent epitopes among 195 clones: PEQ and DxFV[F/Y]Q (Fig. 2A and Table S1). Peptides with the PEQ tripeptide emerged from both linear and constrained libraries, whereas those with DxFV[F/Y]Q were identified almost exclusively from the constrained library pool.

In Vitro Evolution of CD-Specific Peptides. To improve the reactivity of and consensus between first-generation peptides, a focused library of the form X6PEQX6 was screened as above. Pooled sera groups (n = 3 subjects per group) were used only once for library enrichment to favor peptides cross-reactive with antibodies from many patients with CD. The X6PEQX6 library was enriched for IgG- and IgA-specific binders, but IgG binders were more rapidly enriched and cross-reactive to multiple CD groups in comparison with IgA binders; thus, our subsequent analysis focused on IgG isotype reactivity. From the enriched library population, three highly represented consensus motifs were observed: PEQxFP, PEQPL, and [A/V]FPEQ (Fig. 2A). To assess the diagnostic sensitivity and specificity of individual peptides, the reactivity of one representative clone from each motif group was measured using CD case (n = 18) and non-CD control sera (n = 5) not used for screening. The PEQxFP motif-derived peptide VWDRGVPEQMFPRKG reacted with 18/18 CD sera, whereas VAWTMGPEQPLVRAL reacted with 11/18, and GQGQAFPEQGSVPIN reacted with 14/18. None of the peptides were reactive with control sera. To increase the information content and diagnostic performance of the most reactive consensus motif, a second cycle of epitope expansion was performed. Thus, a library of the form X5PEQXFPX4 composed of 10^8 members was screened as above, using sera dilutions of 1:500 and 1:1,000. Epitopes identified from the final screening cycle exhibited an evolved consensus of PFPEQxFP, AFPEQxFP, or QPEQ[A/S]FPE (Fig. 2A). Collectively, the entire set of peptides obtained from the second focused library exhibited the evolved consensus dodecamer sequence PxE[P/A][Q/F]PEQxFP[E/D] (Fig. 2C), after adjusting the final position for the overrepresentation of arginine that results from random-codon–generated RPLs. To assess whether epitope evolution improved the sensitivity and specificity of the identified peptide epitopes, four to five clones from each PEQ motif group (Fig. 2A) were pooled and assayed for reactivity with pooled sera from five patients with CD or non-CD subjects. Pooled clones from each expansion cycle exhibited increased reactivity (P < 0.0001) with sera from patients with CD and decreased reactivity with non-CD sera (P < 0.0001) (Fig. 2B), demonstrating that epitope

Fig. 1. Library screening algorithm to identify and evolve antibody-detecting peptides. A bacterial display random peptide library is subjected to repeated cycles of enrichment and subtraction with a sequence of pooled sera from CD groups or non-CD groups. Using consensus information from the primary library, a second-generation library is constructed and similarly screened with a new set of CD and non-CD sera.
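The focused libraries in this workflow are built from degenerate NNS codons (see Materials and Methods), and the Results note that random-codon libraries overrepresent arginine. The following is a minimal sketch of where that bias comes from, assuming the standard genetic code; the function and variable names are illustrative and do not come from the paper.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nns_codon_counts():
    """Count how many NNS codons (N = any base, S = C or G) encode each residue."""
    counts = Counter()
    for first, second, third in product(BASES, BASES, "CG"):
        counts[CODON_TABLE[first + second + third]] += 1
    return counts


counts = nns_codon_counts()
total = sum(counts.values())  # number of distinct NNS codons
for residue, n in counts.most_common():
    print(f"{residue}: {n}/{total} = {n / total:.3f}")
```

Under this scheme arginine, leucine, and serine are each encoded by 3 of the 32 NNS codons, whereas residues such as tryptophan have only 1, which is one plausible reason a consensus position would need adjusting for arginine; `*` denotes the single remaining stop codon (TAG).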

Fig. 2. Directed evolution of antibody-detecting peptides increases their sensitivity and specificity. (A) Sequences of individual peptides from the three most abundant consensus groups in each cycle of epitope evolution. See SI Materials and Methods for a complete list. (B) Bacterial clones expressing the PEQ-related peptides in A, Upper were pooled and assessed for IgG reactivity to five CD and five non-CD sera groups. Shown is a box-and-whiskers plot of the reactivity (fluorescence intensity) of each CD and non-CD sera group. The median value is plotted as a line with each box displaying the distribution of the inner quartiles, with whiskers showing the upper and lower quartiles (all differences are statistically significant, P < 0.0001). (C and D) Evolved consensus epitopes for (C) PEQ motif and (D) CSE generated using WebLOGO3.0.
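The consensus notation used for these epitope families maps naturally onto regular expressions. The sketch below checks the peptide sequences quoted in the text against the named motifs. It assumes that "x" means any residue and that a bracketed pair such as [F/Y] means either residue; the dictionary layout and the helper name `matching_motifs` are illustrative only.

```python
import re

# Consensus motifs named in the text, written as regular expressions.
# Assumption: "x" = any residue, "[A/B]" = either residue A or B.
MOTIFS = {
    "PEQxFP": r"PEQ.FP",
    "PEQPL": r"PEQPL",
    "[A/V]FPEQ": r"[AV]FPEQ",
    "QPEQAFPE": r"QPEQAFPE",
    "PFPEQxFP": r"PFPEQ.FP",
    "D[S/T]FV[F/Y]Q": r"D[ST]FV[FY]Q",
}

# Peptide sequences quoted in the Results.
PEPTIDES = [
    "VWDRGVPEQMFPRKG",
    "VAWTMGPEQPLVRAL",
    "GQGQAFPEQGSVPIN",
    "RGRAQPEQAFPESVG",    # DGP3
    "GPQPFPEQLFPDPFR",    # DGP6
    "MDVRCRDSFVYQCHVGT",  # CSE peptide
]


def matching_motifs(peptide):
    """Return the names of all motifs found anywhere in the peptide."""
    return [name for name, pattern in MOTIFS.items() if re.search(pattern, peptide)]


for peptide in PEPTIDES:
    print(peptide, matching_motifs(peptide))
```

Both evolved peptides (DGP3 and DGP6) still contain the shorter PEQxFP motif, which illustrates how each round of epitope expansion adds flanking information to the earlier consensus rather than replacing it.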

expansion increased the diagnostic sensitivity and specificity of the identified peptides. Thus, in vitro directed evolution yielded peptide epitopes specifically recognized by IgG antibodies of patients with CD.

To evolve the DxFV[F/Y]Q epitope, a second-generation library of the form X6[D/E]xFV[Y/F]QCX4 was screened. This library was more readily enriched for IgA rather than IgG binders. Additional consensus residues emerged within the randomized region and cysteine-constrained epitope variants were preferred, including CRD[S/T]FV[F/Y]QC, RCxD[S/T]FV[F/Y]QC, and DCFV[F/Y]QC (Fig. 2A and Table S2). Similarly, screening of a linear third-generation library of the form X6D[S/T/A]FV[F/Y]QX4 identified a preference for cyclic peptides having the consensus CEDSFV[F/Y]QC (Fig. 2D) and nonconstrained linear epitopes with the consensus ΩD[S/T]FV[F/Y]Q, where Ω = [L/I/M/F/E] (Table S2). Importantly, the unique celiac-specific epitope (CSE) was not a mimic of TG2 or deamidated gliadin (DGP) because antibody titers against these CD antigens were unaffected by depletion of antibodies binding to the unique epitope (Fig. S3 A–C). Given the weak consensus at the Ω position, the degenerate search motif D[S/T/C]FV[F/Y]Q was used along with ScanProsite to identify a panel of candidate antigens (Table S3).

Evolved Peptide Epitopes Exhibit High Diagnostic Sensitivity and Specificity. To evaluate the diagnostic utility of expanded peptide epitopes from one cohort of cases and controls (Tables S4 and S5), sera from a second cohort of CD cases and controls (n = 78) were assayed in a one-way blinded test. Cases (35/38) were positive for TG2 and/or endomysial antigen serology with partial or total villous atrophy. Of the remaining 3 cases, 2 had total villous atrophy with negative or unavailable serology (Tables S7–S9). All control sera were from healthy donors negative for TG2 IgA. Two peptides (DGP3, RGRAQPEQAFPESVG; and DGP6, GPQPFPEQLFPDPFR) exhibiting high sensitivity and specificity in a preliminary set of 10 CD and 10 non-CD sera were assayed for IgG reactivity, and a diagnostic cutoff was established using the individual patient reactivity dataset. Epitope DGP3 correctly identified 100% of CD cases (38/38) and 97.5% (39/40) of non-CD controls; epitope DGP6 correctly identified 92.1% of CD cases (35/38) and 97.5% (39/40) of non-CD controls (Fig. 3A). For comparison, a commercially available Quanta Lite DGP IgG assay, using a cutoff value of 10 units, achieved 98% sensitivity and 100% specificity (Fig. 3B). Furthermore, assay results with epitope DGP3 correlated with those obtained using Quanta Lite (Fig. 3C). Thus, a single peptide generated using sequential epitope expansion performed equivalently to a proprietary, Food and Drug Administration-approved diagnostic assay.

To determine prevalence of anti-CSE antibodies in patients with CD and control subjects, CD and non-CD sera (n = 231) were assessed for reactivity to the CSE peptide: MDVRCRDSFVYQCHVGT. Overall, the CSE peptide exhibited 71% sensitivity (65/92 CD sera reactive) and 99% specificity (2/139 non-CD sera reactive) (Fig.

Fig. 3. Diagnostic assay enabled by ADEPt. (A and B) Measurement of blinded patient sera (n = 78) for IgG reactivity using (A) DGP3 (Left) and DGP6 (Right) and (B) Quanta Lite. (C) Assay results using ADEPt DGP3 epitope correlate with those obtained using Quanta Lite (Spearman's coefficient, ρ = 0.89). (D) Serum IgA antibody reactivity to DSFVYQ epitope in 231 patient samples. (E) Matched sera from patients with CD before and after 1 y of GFD exhibit decreased reactivity to DSFVYQ.
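The sensitivity and specificity figures reported for these assays follow directly from the case and control counts given in the text. The sketch below recomputes them. It assumes that the "2/139" reported for the CSE peptide counts reactive non-CD sera (false positives), so that 137 of 139 controls were correctly negative; the function names and table layout are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of disease cases that the assay calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of controls that the assay calls negative."""
    return true_neg / (true_neg + false_pos)


# (correctly identified cases, total cases, correctly identified controls, total controls)
ASSAYS = {
    "DGP3": (38, 38, 39, 40),
    "DGP6": (35, 38, 39, 40),
    "CSE": (65, 92, 137, 139),  # assumption: 2 of 139 controls were reactive
}

for name, (tp, cases, tn, controls) in ASSAYS.items():
    sens = sensitivity(tp, cases - tp)
    spec = specificity(tn, controls - tn)
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Under that assumption the CSE values come out at about 70.7% and 98.6%, which round to the 71% and 99% quoted in the text, and the DGP3 specificity of 97.5% rounds to the 98% quoted in the abstract.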

3D). To determine whether the serum antibody titer against the CSE epitope dissipated after the introduction of a gluten-free diet (GFD), sera from 11 CD cases obtained at time of diagnosis or after 1 y on a GFD were assayed. Patients with active CD (10/11) were reactive and 8/11 of these patients exhibited reduced, but nonzero, levels of epitope reactivity after a GFD (Fig. 3E); all patients were seronegative for TG2 and DGP antibodies after a GFD. Together, these results suggest the CD-specific peptide is derived from an antigen distinct from TG2 and DGP epitopes.

Directed Evolution of Peptide Epitopes Facilitates Nonself Antigen Discovery. Due to the substantially increased information content within the third-generation evolved consensus epitopes (QPEQAFPE, PFPEQxFP) compared with the first-generation epitope PEQ, we reasoned that evolved epitopes might enable unbiased antigen identification within the entire protein database. Unbiased BLASTp searches of the epitopes QPEQAFPE and PFPEQxFP directly identified cereal grain proteins from the tribe Triticeae, including gliadins, hordeins, and secalins (Fig. 4). For comparison, an identical search using the first- and second-generation motifs PEQ and PEQxFP yielded an excessive number of unrelated hits and did not enable antigen discovery. The highest-scoring antigen, obtained using the epitope consensus QPEQAFPE, was ω-gliadin from wheat (Fig. 4A). Similarly, use of the aggregate (i.e., using all sequences) consensus epitope from third-generation peptides (PxEP[Q/F]PEQxFPE; Fig. 2C) identified exclusively ω-gliadins among the 25 highest-scoring sequences. Searches performed with the third-generation motif PFPEQxFP also identified a diverse group of prolamins from wheat, barley, and rye (Fig. 4B). The third-generation motifs were identical to the prolamin epitopes that, in CD, result from posttranslational deamidation of glutamine to glutamic acid (Q→E) by TG2. Collectively, these results demonstrate that the in vitro directed evolution of epitopes can facilitate discovery of nonself antigens.

Discussion

The antibody diagnostics via evolution of peptides (ADEPt) method presented here provides an effective route to evolve diagnostically efficacious peptides for de novo biomarker discovery and detection without knowledge of disease pathobiology. Previous methods to discover peptides binding to disease antibodies, including antibody profiling and signature analysis using peptide libraries (7, 24), have demonstrated the existence of unique antibody specificities in a broad range of diseases (25). Although the peptides identified have demonstrated diagnostic potential, alone or in panel format (8, 25), their translation to the clinic has been hindered by inadequate diagnostic sensitivity and specificity values. By applying concepts from in vitro directed evolution to human patient samples, we were able to screen large libraries in an iterative fashion for molecular properties (affinity, cross-reactivity, and molecular specificity) that favor diagnostic sensitivity and specificity. In agreement with many prior studies, our results demonstrate that an RPL, in the absence of directed evolution, is insufficient to identify peptides with optimal diagnostic efficacy. Only when the peptide search

space was expanded through directed evolution were we able to achieve accuracies comparable to gold-standard diagnostics for CD. Thus, it may be possible to improve the diagnostic utility of previously reported peptides arising from RPLs using ADEPt.

Although we concluded the directed evolution process after screening the third-generation focused epitope library wherein sensitivity and specificity were maximized (100%, 98%), further cycles of directed evolution could enhance the dynamic range between CD and non-CD signals. In short, our results demonstrate the potentially broad utility of directed evolution in the context of biomarker discovery and diagnostics development.

Fig. 4. Protein antigens containing evolved epitope PEQ motif. (A and B) Proteins and organisms identified by query of (A) QPEQAFPE and (B) PFPEQXFP against the nonredundant protein database, using BLASTp (PAM30 Matrix) and rank ordered by total score.

Here, environmental (i.e., nonhuman) protein antigens recognized by CD-specific antibodies were unambiguously identified using ADEPt. Multiple methods have been developed to identify candidate autoantigens, including synthetic peptide and peptoid arrays (3), whole-protein antigen arrays (1), and human cDNA or peptidome libraries (26). In contrast, methods to identify nonhuman antigens most closely associated with disease have not been reported. The rapidly expanding protein database, currently composed of more than 31 million protein sequences, is simply too large to enable database searching using the limited consensus data arising from a first-generation RPL. Epitope expansion using ADEPt dramatically reduced the frequency of antigen candidates within the nonredundant protein database, enabling precise identification of immunodominant B-cell epitopes (ω-gliadin, γ-gliadin, and B-hordein). Interestingly, the immunodominant B-cell epitopes were highly similar to recently elucidated immunodominant T-cell epitopes (27). We did not observe linear B-cell epitopes derived from the CD-specific autoantigen TG2, which is consistent with the proposed existence of immunodominant structural epitopes within TG2 (28). However, we cannot rule out the possibility that lower-abundance linear epitopes or structural mimotopes were enriched during library screening but outcompeted by DGP and D[S/T]FV[Y/F]Q peptides. Future efforts using next-generation sequencing and bioinformatic tools may permit identification and characterization of a greater number and variety of disease-associated peptide epitopes.

Application of ADEPt to sera from patients with CD identified a previously unreported CSE. Antibodies binding CSE peptides with the consensus motif CXD[S/T]FV[Y/F]QC were present in 71% of patients with CD from geographically distinct cohorts and exhibited equivalent specificity (∼99%) for CD compared with gold-standard antibody biomarkers of CD (anti-TG2 IgA, anti-endomysial antibodies, and anti-DGP IgG). The sensitivity and specificity values observed with CSE are significant because many distinct antibodies have been reported to be present in patients with CD but the same specificities have been observed in unrelated disorders (29, 30). In contrast, the anti-CSE antibody specificity occurred exclusively within subjects with CD (29). The observation that anti-CSE antibody titers significantly decrease in matched sera from patients pre- and 1 y post-GFD further supports the disease specificity of this antibody specificity. Although the precise identity of the antigen mimicked by CSE remains to be elucidated, the ability of the evolved consensus epitope to narrow our search to <40 candidate antigens suggests that antigen discovery will be possible. Although these data highlight the need for an unbiased strategy to down-select candidate antigens, it is interesting to note the presence of human commensals and pathogens (Prevotella, Roseburia, Lactobacillus, Bacteroides, Vibrio, Burkholderia, Giardia, and Bacillus) and the common wheat fungal pathogen (Puccinia) among the candidates. Systematic analysis of fragments containing the epitope from each candidate antigen for their sensitivity and specificity may provide a means to uncover the antigen that gave rise to this antibody specificity. Finally, the well-established significance of DGP antibodies and TG2 to the pathobiology of CD suggests that confirmation of the antigen corresponding to CSE may provide additional clues regarding the mechanisms of CD pathogenesis.

In summary, we present ADEPt as a method enabling the simultaneous discovery of antibody biomarkers of disease and reagents for their sensitive and specific detection. In principle, ADEPt could be applied to a variety of antibody-containing specimens (e.g., serum/plasma, cerebrospinal fluid, urine, and saliva). Given the ubiquitous nature of antibody repertoire changes observed in diverse diseases, ADEPt may be useful to create diagnostics for early disease detection, stratification, and therapeutic monitoring (31). And finally, because this method is

not constrained to searches for autoantigens, ADEPt may be useful to reveal previously unknown environmental factors involved in disease.

Materials and Methods

Bacterial Display Peptide Library Screening. Bacterial display peptide libraries of the form X15, X4CX7CX4, or X13CX2 were screened using FACS and MACS to identify peptides binding to antibodies in sera from patients with CD but not to those in non-CD sera (Tables S4 and S5). The library was depleted of non-CD (i.e., healthy and GI-illness controls) antibody-binding peptides, using MACS. A frozen aliquot of each library containing 20 times the expected diversity was inoculated into 500 mL LB (10 g tryptone, 5 g yeast extract, and 10 g/L NaCl) supplemented with 34 μg/mL chloramphenicol (Cm) and grown to OD600 = 0.5 at 37 °C with vigorous shaking (250 rpm). Protein expression was induced by addition of L(+)-arabinose to a final concentration of 0.02% wt/vol with shaking at 37 °C for 1 h. Cells (2.5 × 10^10) were centrifuged (3,000 × g, 4 °C, 10 min) and resuspended in cold phosphate buffered saline with Tween 20 (PBST). To deplete the library of streptavidin- and protein A/G-binding clones, washed streptavidin-conjugated beads and protein A/G beads were added to a ratio of one bead per 50 cells, and the mixture was incubated 45 min at 4 °C on an inversion shaker. A magnet was then applied to the tube for 5 min and the unbound cells in the supernatant were recovered. To deplete the library of secondary antibody-binding peptides, a 1:500 dilution of secondary antibody was incubated with the cells followed by incubation with streptavidin (SA) beads and removal by magnet similar to SA-binding peptide removal. Subtractive MACS steps for removal of nonspecific serum antibody-binding peptides were performed in a similar manner to that of SA and protein A/G depletions except that before incubation with biotinylated secondary antibody or beads, the library was first incubated with 1:100 pooled non-CD sera (n = 8 subjects per pool) for 45 min at 4 °C, followed by washing two times with PBST. For positive selection, pooled CD sera 1:100–1:200 (n = 8 subjects per pool) were added to the library and incubated for 45 min. Magnetic separation was used to wash the beads three times with PBST, and the pellet was resuspended in LB with Cm and 0.2% glucose (wt/vol) for overnight growth. For flow cytometric analysis and sorting, induced cells corresponding to five times the estimated remaining clonal diversity were incubated with 1:100–1:200 dilution of pooled sera for 45 min at 4 °C followed by centrifugation and removal of unbound supernatant (washing three times). The pellet was then resuspended in the respective 1:500 biotinylated goat anti-human secondary antibody (IgA/IgG/IgM) in 1× PBST at 4 °C for 45 min followed by centrifugation and removal of unbound supernatant (washing two times). For tertiary labeling, the pellet was resuspended in 15 nM SA-phycoerythrin in 1× PBST at 4 °C for 45 min, followed by another wash via centrifugation and resuspension (washing three times) in ice-cold PBST at a volume between 10^7 and 10^8 cells/mL. Resuspended cells were analyzed using a FACSAria cell sorter (Becton Dickinson), using 488-nm excitation. After sorting, retained cells were amplified for further rounds of sorting by overnight growth and plated to isolate single clones.

Epitope Evolution by Cytometric Screening. Second-generation libraries were constructed of the form X6PEQX6 and X6[E/D]XFV[YF]QCX4 on the N terminus of eCPX, using degenerate NNS oligonucleotides (32) (Table S6), resulting in an estimated library diversity of 2 × 10^8 and 1 × 10^8 members, respectively. Third-generation libraries of the form X5PEQXFPX4 and X4D[STA]FV[YF]QX5 were similarly constructed (Table S6). Directed library screening was performed as above except that unique nonrepeating pools of sera from patients with CD (n = 3 subjects per pool) were used for each round of enrichment such that no pool was used more than once and a nonrepeating non-CD control pool (n = 3–5 subjects per pool) was used for each round of subtraction. In an effort to expand upon the known antigenic sequence, third-generation libraries were screened using 1:500 and 1:1,000 pooled disease sera. PEQ focused libraries were screened by IgG isotype and DXFV[F/Y]Q libraries by IgA isotype.

Additional Methods. Additional descriptions of reagents and methods, including cohort information, sample handling and preparation, clone reactivity assays, antigen mimicry assays, and protein database queries for candidate antigen identification, are available in SI Materials and Methods.

ACKNOWLEDGMENTS. This work was supported by the National Institutes of Health Grants AI09224 and DK080395 (to P.S.D.); DK057892 (to J.A.M.); and DK35108; and a grant from the William K. Warren Foundation (to M.F.K.). The Celiac Disease Study Group has been financially supported by the Academy of Finland, the Sigrid Juselius Foundation, and the Competitive State Research Financing of the Expert Responsibility Area of Tampere University Hospital (Grants 9H166, 9P020, and 9P033).

1. Anderson KS, et al. (2011) Protein microarray signature of autoantibody biomarkers for the early detection of breast cancer. J Proteome Res 10(1):85–96.
2. Lewis JD (2011) The utility of biomarkers in the diagnosis and therapy of inflammatory bowel disease. Gastroenterology 140(6):1817–1826.e2.
3. Reddy MM, et al. (2011) Identification of candidate IgG biomarkers for Alzheimer's disease via combinatorial library screening. Cell 144(1):132–142.
4. Fritzler MJ (2008) Challenges to the use of autoantibodies as predictors of disease onset, diagnosis and outcomes. Autoimmun Rev 7(8):616–620.
5. Sherer Y, Gorstein A, Fritzler MJ, Shoenfeld Y (2004) Autoantibody explosion in systemic lupus erythematosus: More than 100 different antibodies found in SLE patients. Semin Arthritis Rheum 34(2):501–537.
6. Huizinga TW, et al. (2005) Refining the complex rheumatoid arthritis phenotype based on specificity of the HLA-DRB1 shared epitope for antibodies to citrullinated proteins. Arthritis Rheum 52(11):3433–3438.
7. Cortese R, et al. (1994) Epitope discovery using peptide libraries displayed on phage. Trends Biotechnol 12(7):262–267.
8. Kouzmitcheva GA, Petrenko VA, Smith GP (2001) Identifying diagnostic peptides for lyme disease through epitope discovery. Clin Diagn Lab Immunol 8(1):150–160.
9. Bartoli F, et al. (1998) DNA-based selection and screening of peptide ligands. Nat Biotechnol 16(11):1068–1073.
10. Osman AA, et al. (2000) B cell epitopes of gliadin. Clin Exp Immunol 121(2):248–254.
11. Spatola BN, Murray JA, Kagnoff M, Kaukinen K, Daugherty PS (2013) Antibody repertoire profiling using bacterial display identifies reactivity signatures of celiac disease. Anal Chem 85(2):1215–1222.
12. Cortese I, et al. (1996) Identification of peptides specific for cerebrospinal fluid antibodies in multiple sclerosis by using phage libraries. Proc Natl Acad Sci USA 93(20):11063–11067.
13. Zanoni G, et al. (2006) In celiac disease, a subset of autoantibodies against transglutaminase binds toll-like receptor 4 and induces activation of monocytes. PLoS Med 3(9):e358.
14. Leffler DA, Schuppan D (2010) Update on serologic testing in celiac disease. Am J Gastroenterol 105(12):2520–2524.
15. van Venrooij WJ, van Beers JJ, Pruijn GJ (2011) Anti-CCP antibodies: The past, the present and the future. Nat Rev Rheumatol 7(7):391–398.
16. Saito H, et al. (2003) Isolation of peptides useful for differential diagnosis of Crohn's disease and ulcerative colitis. Gut 52(4):535–540.
17. Cortese I, et al. (1998) Identification of peptides binding to IgG in the CSF of multiple sclerosis patients. Mult Scler 4(1):31–36.
18. Fujimori J, et al. (2011) Epitope analysis of cerebrospinal fluid IgG in Japanese multiple sclerosis patients using phage display method. Mult Scler Int 2011:353417.
19. Dybwad A, Førre O, Natvig JB, Sioud M (1995) Structural characterization of peptides that bind synovial fluid antibodies from RA patients: A novel strategy for identification of disease-related epitopes using a random peptide library. Clin Immunol Immunopathol 75(1):45–50.
20. Mennuni C, et al. (1997) Identification of a novel type 1 diabetes-specific epitope by screening phage libraries with sera from pre-diabetic patients. J Mol Biol 268(3):599–606.
21. Mennuni C, et al. (1996) Selection of phage-displayed peptides mimicking type 1 diabetes-specific epitopes. J Autoimmun 9(3):431–436.
22. Bason C, et al. (2013) In type 1 diabetes a subset of anti-coxsackievirus B4 antibodies recognize autoantigens and induce apoptosis of pancreatic beta cells. PLoS ONE 8(2):e57729.
23. Kagnoff MF (2007) Celiac disease: Pathogenesis of a model immunogenetic disease. J Clin Invest 117(1):41–49.
24. Restrepo L, Stafford P, Johnston SA (2013) Feasibility of an early Alzheimer's disease immunosignature diagnostic test. J Neuroimmunol 254(1-2):154–160.
25. Fierabracci A (2009) Unravelling autoimmune pathogenesis by screening random peptide libraries with human sera. Immunol Lett 124(1):35–43.
26. Larman HB, et al. (2011) Autoantigen discovery with a synthetic human peptidome. Nat Biotechnol 29(6):535–541.
27. Tye-Din JA, et al. (2010) Comprehensive, quantitative mapping of T cell epitopes in gluten in celiac disease. Sci Transl Med 2(41):41ra51.
28. Simon-Vecsei Z, et al. (2012) A single conformational transglutaminase 2 epitope contributed by three domains is critical for celiac antibody binding and effects. Proc Natl Acad Sci USA 109(2):431–436.
29. Alaedini A, Green PH (2008) Autoantibodies in celiac disease. Autoimmunity 41(1):19–26.
30. D'Angelo S, et al. (2013) Profiling celiac disease antibody repertoire. Clin Immunol 148(1):99–109.
31. Roep BO, Buckner J, Sawcer S, Toes R, Zipp F (2012) The problems and promises of research into human immunology and autoimmune disease. Nat Med 18(1):48–53.
32. Getz JA, Schoep TD, Daugherty PS (2012) Peptide discovery using bacterial display and flow cytometry. Methods Enzymol 503:75–97.
