Massively parallel enzyme kinetics reveals the substrate recognition landscape of the metalloprotease ADAMTS13

Colin A. Kretz^a, Manhong Dai^b, Onuralp Soylemez^c,d, Andrew Yee^a, Karl C. Desch^e, David Siemieniak^a,f, Kärt Tomberg^g, Fyodor A. Kondrashov^c,d,h, Fan Meng^b, and David Ginsburg^a,e,f,g,i,1

^a Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
^b Department of Psychiatry and Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109
^c Bioinformatics and Genomics Programme, Centre for Genomic Regulation, Barcelona, Spain 08003
^d Universitat Pompeu Fabra, Barcelona, Spain 08003
^e Department of Pediatrics, University of Michigan Medical School, Ann Arbor, MI 48109
^f Howard Hughes Medical Institute, Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109
^g Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109
^h Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain 08003
^i Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109

Contributed by David Ginsburg, June 13, 2015 (sent for review April 29, 2015; reviewed by Gregg B. Fields)

Proteases play important roles in many biologic processes and are key mediators of cancer, inflammation, and thrombosis. However, comprehensive and quantitative techniques to define the substrate specificity profile of proteases are lacking. The metalloprotease ADAMTS13 regulates blood coagulation by cleaving von Willebrand factor (VWF), reducing its procoagulant activity. A mutagenized substrate phage display library based on a 73-amino acid fragment of VWF was constructed, and the ADAMTS13-dependent change in library complexity was evaluated over reaction time points, using high-throughput sequencing. Reaction rate constants (kcat/KM) were calculated for nearly every possible single amino acid substitution within this fragment. This massively parallel analysis detailed the specificity of ADAMTS13 and demonstrated the critical importance of the P1-P1′ substrate residues while defining exosite binding domains. These data provided empirical evidence for the propensity for epistasis within VWF and showed strong correlation to conservation across orthologs, highlighting evolutionary selective pressures for VWF.

phage display | protease | high-throughput sequencing | ADAMTS13 | von Willebrand factor

Significance

Here we report a method to rapidly examine the effect of nearly all possible single amino acid substitutions within a substrate fragment of the coagulation protein von Willebrand factor (VWF) on the efficiency of cleavage by its cognate protease, ADAMTS13. A substrate phage display library was generated containing ∼3.5 × 10^7 independent clones and uncleaved phages collected at multiple reaction time points after reaction with ADAMTS13. Analysis of these phages by high-throughput sequencing facilitated simultaneous calculations of kcat/KM values for multiple substitutions at each position of this protein fragment, providing a comprehensive picture of the substrate recognition landscape for the interaction between ADAMTS13 and VWF. This approach should be broadly applicable to many other protease/substrate pairs.

Author contributions: C.A.K., K.C.D., and D.G. designed research; C.A.K. performed research; C.A.K., M.D., O.S., A.Y., D.S., K.T., F.A.K., F.M., and D.G. analyzed data; and C.A.K., O.S., F.A.K., F.M., and D.G. wrote the paper.
Reviewers included: G.B.F., Torrey Pines Institute for Molecular Studies.
The authors declare no conflict of interest.
^1 To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1511328112/-/DCSupplemental.

Protease specificity is critical for maintaining diversity and compartmentalization of function, and is tightly controlled. For many proteases, a substrate initially docks to an exosite, which captures and orients the substrate scissile bond toward the active site of the enzyme. At the active site, the Px-Px′ (1) substrate amino acid side chains align with the complementary Sx-Sx′ pockets of the enzyme to optimize recognition by the residues that execute the proteolytic reaction (2).

Conventional techniques for probing the substrate recognition requirements of a protease are cumbersome and time-consuming and require intimate knowledge of the enzyme/substrate pair. Such methods include engineering deletion mutants (3), use of competitive ligands (4, 5), and site-directed mutagenesis (6, 7). In contrast to these techniques, substrate phage display is a high-throughput, unbiased approach to studying protease substrate specificity (8–10). In this method, a library consisting of 10^6–10^9 independent phage clones, each expressing a unique potential substrate on its surface, is panned for multiple rounds with a protease, and the cleaved or uncleaved phages after each reaction are removed and amplified for subsequent rounds of selection. In this manner, the library complexity is iteratively reduced and becomes populated by peptide sequences that are most informative. This methodology, although useful, is limited by the number of clones selected for individual Sanger sequencing after the last round of selection, and the selection of phages based on competitive growth advantages unrelated to enzyme specificity.

The availability of high-throughput DNA sequencing technology (11) has facilitated detailed analysis of the changing complexity within a phage display library (12–16) without requiring multiple rounds of selection and amplification. By sequencing millions of phage particles, changes in the composition of the library can be precisely monitored.

To explore the utility of this approach, we focused on the metalloprotease ADAMTS13. A member of the Metzincin family of metalloproteases (17), ADAMTS13 is unique among proteases because it circulates in an active form, has no known natural inhibitors, and has only a single known substrate, von Willebrand factor (VWF) (18). Deficiency in ADAMTS13 results in thrombotic thrombocytopenic purpura, a devastating blood coagulation disorder characterized by pathological deposition of thrombi in the microvasculature (19–21). In contrast, mutations within the VWF A2 domain that increase susceptibility to ADAMTS13 cleavage result in a bleeding phenotype (type 2A von Willebrand disease) (22). A 73-amino acid fragment of the VWF A2 domain, termed VWF73, was identified as an efficient substrate of ADAMTS13 (3) and forms the basis for clinical assays of ADAMTS13 activity (23).

Given the importance of ADAMTS13/VWF interactions, we combined substrate phage display of a mutagenized VWF73 library with high-throughput sequencing. This methodology enabled massively parallel enzyme kinetic measurements and permitted the calculation of the apparent rate constant (kcat/KM) for single or

9328–9333 | PNAS | July 28, 2015 | vol. 112 | no. 30 | www.pnas.org/cgi/doi/10.1073/pnas.1511328112

double amino acid substitution or substitutions within the mutant VWF73 library. The data provide a comprehensive substrate recognition landscape for ADAMTS13, defining exosite binding domains within VWF73 at amino acid resolution level. This methodology provides a general platform for rapid and comprehensive protease profiling that should be readily adaptable to many other protease/substrate pairs.

Results

Cleavage of a VWF73 Variant Library by ADAMTS13 Follows First-Order Reaction Kinetics and Contains Sequences Resistant to Proteolysis. A library of VWF fragments encoding variants of Asp1596–Arg1668 (VWF73) was constructed with synthetic oligonucleotides (SI Appendix, Fig. S1) that contain the wild-type (WT) nucleotide at 97% frequency at each position and the three mutant nucleotides at a 1% frequency each (Fig. 1A). The library was cloned into the FUSE55 M13 filamentous phage display vector (24), yielding ∼3.5 × 10^7 independent clones. High-throughput sequencing of the mutant VWF73 library identified 3,684,035 unique peptide sequences, with 3.02% average mutation frequency at each nucleotide position (Fig. 1B), resulting in a 6.79% average substitution rate at each of the 73 amino acid positions (Fig. 1C). Of the 1,387 possible single amino acid substitutions within VWF73 (73 × 19), 1,377 were present in the unselected library. Notably, 245 of these 1,387 amino acid substitutions require changes at all three nucleotides of the codon and were expected to be under-represented (frequency of 0.01^3 = 10^−6). Nonetheless, among this latter group of substitutions, 96% (235) were represented, with the remainder accounting for all 10 of the amino acid changes missing from the library. Overall, 99% of possible single amino acid substitutions and 51% of possible double amino acid substitutions are represented (Table 1).

After the addition of recombinant ADAMTS13 to phages bearing WT VWF73, proteolysis followed predicted substrate consumption kinetics performed under pseudo-first-order conditions (Fig. 1D). Near-complete substrate consumption (99.2%) was observed by 120 min. In comparison, the mutant VWF73 library was consumed less efficiently, with maximal substrate consumption at 600 min exhibiting only ∼80% cleavage (Fig. 1D). These data suggest the mutant VWF73 library contained variants that were resistant to ADAMTS13 cleavage.

High-Throughput Sequencing of Uncleaved Phages Defines the P3–P2′ and Two Important Exosites as Critical for Proteolysis. The mutation profile of the mutant VWF73 library was examined at 5 time points after the addition of ADAMTS13 (from 0 to 600 min) (SI Appendix, Table S1). WT VWF73, which accounted for 0.056% of the total reads in the unselected library, decreased 30-fold by the 600-min reaction time. We applied the Enrich pipeline (25) to evaluate the fold-change in mutation frequency from 0 to 600 min. Enrichment of VWF73 variants after selection of uncleaved phages corresponds to substitutions that resisted cleavage by ADAMTS13. Overall, 590 amino acid substitutions were significantly enriched and 178 were significantly depleted (false discovery rate, pFDR < 0.05) relative to the corresponding WT amino acid in the selected library (SI Appendix, Fig. S3). Fig. 2 shows the change in mutation frequency at each position of VWF73 (Fig. 2, Top) and for each amino acid variant (Fig. 2, Bottom, and Dataset S1) and defines three distinct mutation-sensitive segments within VWF73. Segment A, spanning Pro6–Val12 of VWF73, includes the Tyr10–Met11 scissile bond and showed the greatest enrichment of mutation frequency. Segment B spans Asp19–Asp27 and contains Ile21, a previously identified exosite-binding residue (26). Segment C spans Pro53–Leu69 and contains a number of residues exhibiting mutation enrichment after selection (Fig. 2, Bottom). These data are consistent with previous, much more limited, biochemical studies (3, 27–30) and provide a high-resolution substrate recognition landscape for ADAMTS13.

Massively Parallel kcat/KM Determination for Variants in the Mutant VWF73 Library. Enzyme/substrate reactions are governed by a reaction rate constant, k, that describes the likelihood of a productive interaction leading to substrate hydrolysis. Rate constants are traditionally determined as individual reactions. Because ADAMTS13 cleavage of the mutant VWF73 phage library followed first-order kinetics (Fig. 1D), individual rate constants could be determined for each mutant in the library, using high-throughput sequencing data collected from the uncleaved phages at reaction time points. The mutation frequency enrichment profile for each time point (SI Appendix, Fig. S4) resembled the profile at 600 min (Fig. 2, Top), with a time-dependent increase in the magnitude of

Fig. 1. Assembly and characterization of mutant VWF73 substrate phage display library. (A) VWF fragments encoding Asp1596–Arg1668 were synthesized with ∼1% mutant nucleotides at each position. (B) The distribution of mutation frequency for nucleotides in the unselected library shows a right-tailed normal distribution centered around 3%. (C) The unselected library shows an equivalent rate of amino acid substitutions at each position of VWF73, although a higher frequency was seen at Tyr10. The bars represent the overlap between mutant synthetic oligonucleotides used for library assembly. The mutation frequency in the regions of overlapping mutant oligonucleotides was ∼2.1%, which was only 1% lower than the average mutation frequency across the entire length of VWF73. The comparable amino acid diversity across overlapping oligonucleotides suggests this approach could be extended to larger proteins. Only the last nucleotide exhibited a markedly lower mutation frequency (0.067%; SI Appendix, Fig. S2), possibly reflecting incomplete removal of the nonrandomized oligonucleotides that preceded this position during synthesis. The reduced mutation frequency at this wobble codon position did not appreciably impact the amino acid diversity. (D) Phages displaying a mutagenized VWF73 library (●) or WT VWF73 (■) were reacted with 5 nM ADAMTS13 at 37 °C. Uncleaved phages were quantified at reaction time points, using AlphaLISA. The data were fit by nonlinear regression in GraphPad, using the one-phase exponential decay equation. Data represent the mean of three independent experiments.

Table 1. Complexity of mutant VWF73 phage library

              Single substitution      Double substitutions
Sequence      Exp        Obs           Exp          Obs
Nucleotide    657        657           214,839      214,678
Protein       1,387      1,377         948,708      542,422

High-throughput sequencing of the unselected phages reveals the complexity of the mutant VWF73 library. The number of expected (Exp) single substitutions at each position is compared with the observed (Obs) for both nucleotide and amino acid substitutions. Similarly, all combinations of two substitutions observed are compared with expected.
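The expected (Exp) counts in Table 1 follow from simple combinatorics over the 73 codons (219 nucleotides) of VWF73. The sketch below reproduces them; the variable names are illustrative and not taken from the paper, and the double-substitution formula assumes the two changes fall at two distinct positions.

```python
from math import comb

N_CODONS = 73             # VWF73 length in amino acids
N_NT = 3 * N_CODONS       # 219 nucleotide positions

# Single substitutions: every position can change to each alternative.
nt_single = N_NT * 3      # 3 alternative nucleotides per position
aa_single = N_CODONS * 19 # 19 alternative amino acids per position

# Double substitutions (assumed: two distinct positions, one alternative at each).
nt_double = comb(N_NT, 2) * 3 ** 2
aa_double = comb(N_CODONS, 2) * 19 ** 2

# Frequency of one specific triple-nucleotide codon change when each
# mutant nucleotide is present at 1% per position.
p_triple = 0.01 ** 3

print(nt_single, aa_single)   # 657 1387
print(nt_double, aa_double)   # 214839 948708
print(p_triple)
```

With a triple-change frequency of about 10^−6 and ∼3.5 × 10^7 independent clones, a given triple-nucleotide codon change would be expected in only a few tens of clones, which is why these substitutions were expected to be under-represented.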

Fig. 2. Enrichment after selection of mutant VWF73 substrate phage library. Enrichment of VWF73 sequences that remained uncleaved after a 600-min reaction with 5 nM ADAMTS13 was calculated using the Enrich analysis package (12, 25). The average enrichment in mutation frequency at each position of VWF73 was calculated by dividing the mutation frequency in the selected uncleaved phages after a 600-min reaction with ADAMTS13 by the mutation frequency in the starting library (Top). Enrichment in mutation frequency (log2 average enrichment > 1) indicates positions of VWF73 at which mutations, on average, inhibit cleavage by ADAMTS13, whereas a depleted mutation frequency (log2 average enrichment < 1) indicates positions at which mutations on average enhance cleavage. Three segments (indicated as A, B, and C) highlight amino acid intervals at which amino acid substitutions tended to inhibit cleavage. A heatmap of the log2 of enrichment for every single amino acid substitution at each position of VWF73 (Bottom). Enriched substitutions (red) indicate amino acid substitutions that inhibit cleavage by ADAMTS13. Depleted substitutions (blue) indicate amino acid substitutions that enhance cleavage. Amino acid substitutions that were represented by fewer than 10 reads (white) were not analyzed.
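The enrichment described in the Fig. 2 legend is a ratio of frequencies in the selected and starting libraries. The following is a minimal sketch of that ratio, not the Enrich package itself; the function name, the read counts in the example, and the choice to apply the 10-read cutoff to both libraries are assumptions made for illustration.

```python
from math import log2

MIN_READS = 10  # variants with fewer reads were not analyzed (Fig. 2 legend)


def log2_enrichment(sel_count, sel_total, in_count, in_total,
                    min_reads=MIN_READS):
    """Return log2(selected frequency / input frequency).

    Returns None when either count is below min_reads (assumed here to
    apply to both the selected and the starting library).
    """
    if sel_count < min_reads or in_count < min_reads:
        return None
    selected_freq = sel_count / sel_total
    input_freq = in_count / in_total
    return log2(selected_freq / input_freq)


# Hypothetical counts: a variant that is 4-fold more frequent among
# uncleaved phages than in the starting library.
example = log2_enrichment(400, 1_000_000, 100, 1_000_000)
print(example)

# An under-sampled variant is skipped rather than scored.
print(log2_enrichment(3, 1_000_000, 100, 1_000_000))
```

A positive value marks a substitution that accumulates among uncleaved phages (resists cleavage), and a negative value marks one that is lost faster than average.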

enrichment. The apparent pseudo-first-order rate constant, kapp, calculated for WT VWF73 within the library, was 0.2888 ± 0.001, yielding a kcat/KM value of 5.78 × 10^7 M^−1·min^−1 (SI Appendix, Fig. S5), which is comparable to the value determined for the recombinant WT VWF73 peptide (Table 2) (26–28). Although 1,377/1,387 amino acid substitutions were represented in the starting library, only 772 were present as single variants; that is, a phage clone containing only a single amino acid substitution (SI Appendix, Fig. S6A). No cleavage was detected in 42 clones, representing the most resistant substitutions (SI Appendix, Fig. S6B, and Dataset S1). Overall, 363 of the 772 single-variant substrates exhibited kcat/KM values that were significantly inhibited relative to WT VWF73 (pFDR < 0.05). Excellent agreement was observed between these kcat/KM values and those for recombinant peptides randomly selected or previously described in the literature (26) (Table 2) (R^2 = 0.94; SI Appendix, Fig. S7A).

The relatively high rate of mutation in the library resulted in ∼98.2% of sequences containing two or more amino acid substitutions. Establishing the contribution of each substitution to the kcat/KM values of a sequence could be biased by epistatic interactions between the amino acids. To test the degree of epistasis between amino acid residues in VWF73, the kcat/KM values for pairs of amino acid substitutions were calculated and compared with the values obtained for either single variant alone. No significant epistatic interactions were detected (Materials and Methods), possibly reflecting the lack of substantial tertiary structure within VWF73 (31), although our data are underpowered to detect interactions between residues with small effects.

We then tested whether accurate kcat/KM values could be calculated for VWF73 variants using reads derived from clones containing one or more additional amino acid substitution distinct from the primary amino acid variant of interest. Calculation of the kcat/KM for the set of reads with the WT amino acid held constant at each position of VWF73 yielded an average value of 6.6 × 10^6 ± 3.4 × 10^4 M^−1·min^−1, which is ninefold slower than the kcat/KM for WT VWF73, but with remarkably similar values calculated at each position (Fig. 3A). These data confirm that the effect of multiple amino acid substitutions is mostly averaged across sequences sharing a given amino acid at a position of VWF73. The one exception is seen at Tyr10, likely explained by the unique sensitivity of this position to amino acid substitutions (Fig. 2). Substitutions at this P1 position decrease the average kcat/KM value at every other position of VWF73, but this effect is lost when P1 Tyr is held constant.

Because epistatic interactions within VWF73 appear to be rare, the contribution of amino acid substitutions toward kcat/KM values can be considered independently. Therefore, the effect of each amino acid substitution can be averaged regardless of the presence of substitutions elsewhere in the VWF73 sequence. Such average measurements will be referred to as the aggregate mutant. Of all possible 1,387 single amino acid substitutions, 1,300 were present with a sufficient number of aggregate mutant reads for kcat/KM determination (Fig. 3B). The kcat/KM for the aggregate

Table 2. Comparison of apparent kcat/KM from phage screen with recombinant peptides

                 kcat/KM (×10^7 M^−1·min^−1)
VWF73 mutant     Peptide              Phage
WT               *1.4 ± 0.8           5.78 ± 0.02
L8Q              *0.029 ± 0.031       nd‡
L8R              *0.0056 ± 0.001      nd
Y10D             *0.012 ± 0.01        nd
M11T             *0.015 ± 0.01        nd
I21N             *0.029 ± 0.011       nd
R2W              *3.5 ± 1.6           7.67 ± 0.09
Q4D†             1.29 ± 0.07          7.76 ± 0.46
E65R†            0.96 ± 0.08          1.98 ± 0.22
L24M†            1.31 ± 0.05          6.98 ± 0.14
L44A†            1.42 ± 0.08          5.29 ± 0.12

Apparent kcat/KM values calculated from sequencing data (Phage) were compared with values determined for recombinant peptides (Peptide). Values are presented as a mean ± SEM.
*Values obtained from ref. 26.
†Amino acid substitutions analyzed here (Materials and Methods).
‡Cleavage was not detected.
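The reported values are mutually consistent if kapp is expressed in min^−1 and converted with the usual pseudo-first-order relation kcat/KM = kapp/[E], using the 5 nM ADAMTS13 concentration given in the figure legends. Both the unit and the relation are assumptions here, because the conversion is not spelled out in the main text. A short arithmetic check:

```python
ENZYME_M = 5e-9  # 5 nM ADAMTS13, from the figure legends


def kcat_over_km(k_app_per_min, enzyme_molar=ENZYME_M):
    """Convert an apparent first-order rate constant (min^-1, assumed unit)
    to kcat/KM (M^-1 min^-1), assuming substrate well below KM."""
    return k_app_per_min / enzyme_molar


wt = kcat_over_km(0.2888)       # WT VWF73 within the library
wt_sem = kcat_over_km(0.001)    # the reported +/- on kapp, same conversion
held_wt_average = 6.6e6         # average with the WT residue held constant
fold_slower = wt / held_wt_average

print(f"{wt:.3g}")              # 5.78e+07
print(f"{wt_sem:.1g}")          # 2e+05
print(round(fold_slower, 2))    # 8.75
```

The converted uncertainty (2 × 10^5, that is, 0.02 × 10^7) matches the Phage entry for WT in Table 2, and the ratio of about 8.75 corresponds to the "ninefold" difference quoted in the text.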

mutants exhibited a strong correlation with the corresponding single substitution variant (R^2 = 0.74) (SI Appendix, Fig. S7C), as well as with enrichment at 600 min (R^2 = 0.68) (SI Appendix, Fig. S7D). The majority (70%) of variants exhibited kcat/KM values that fit the first-order model, with R^2 values between 0.76 and 1.0 (SI Appendix, Fig. S7E). Overall, 462/1,300 aggregate mutants exhibited kcat/KM values that were significantly (pFDR < 0.05) different from WT VWF73 (SI Appendix, Fig. S11A), with these mutants clustered at regions of VWF73 most sensitive to substitutions (compare SI Appendix, Fig. S8 and Fig. 2, Top).

Fig. 3C illustrates the kcat/KM for every aggregate mutation at each position of VWF73 (see Dataset S1). These values exhibited excellent agreement with rate constants determined experimentally for a select set of recombinant VWF73 peptides (SI Appendix, Fig. S7B). Of the 100 most damaging substitutions, 32 were within the P3-P2′ interval and 13 were at position P1 (Tyr10). As expected, amino acid residues with similar biochemical properties exhibited comparable kcat/KM values. For example, Tyr10Phe was cleaved as efficiently as WT VWF73 and was the only substitution at this position that did not inhibit VWF73 proteolysis. Although substitutions within the segments designated VWF73A and VWF73B (Fig. 2, Top) were the most likely to inhibit cleavage by ADAMTS13, the magnitude of change at any VWF73 position varied with the specific amino acid substitution. For example, Val9 was inhibited by all substitutions except Arg and Ala, whereas substitutions at position Thr61 exhibited enhanced or inhibited rates of VWF73 cleavage, depending on the amino acid change (Fig. 3C). Within the VWF73C segment, multiple positions showed both enhancement and inhibition, depending on the specific amino acid substitution. However, every amino acid substitution at both Ile54 and Ile56 inhibited cleavage, suggesting an important role for Ile at these positions.

Superficial exosite binding regions of VWF73 have previously been shown to correspond to disordered loops and β-sheet secondary structures (32), which are known to have amino acid R-groups with alternating 180° orientations. Indeed, VWF73 segments B and C residues exhibit alternating sensitivity to substitutions (Fig. 3C), consistent with an orientation of amino acid R-groups in the same face toward ADAMTS13, with complementary pockets on ADAMTS13 that prefer Asp19, Ile21, Arg23, and Asp27 or Ile54, Ile56, Asp58, and Glu60 (6, 33). In contrast, the active site of ADAMTS13 lies within a pocket, where substrate residues of all orientations would be expected to interact. Consistent with this model, the VWF73A segment contains sequential amino acids in which substitutions inhibit ADAMTS13-dependent proteolysis (Fig. 3C).

Deletion of NH3 or COOH Termini or Replacement of Gln29–Pro50 Confirms VWF73 Sequences Dispensable to ADAMTS13-Dependent Proteolysis. The quantitative kinetic data define three important contacts between ADAMTS13 and VWF73 (segments A–C, Figs. 2 and 3). In contrast, substitutions within Asp1–Ala5, Gln29–Pro50, and Val70–Arg73 had little effect on the rate of proteolysis. Consistent with these data, deletion or replacement of these segments (SI Appendix, Fig. S9) had little effect on cleavage by ADAMTS13. Although Gln29–Pro50 is dispensable for cleavage of VWF73, several amino acid substitutions (Arg46Asp, Arg46His, or Glu45Ile) in this region were observed with increased rates of proteolysis (Fig. 3C), suggesting additional exosite-binding sites can be engineered to optimize substrate recognition.

Fig. 3. kcat/KM for aggregate single amino acid substitutions. Sequencing uncleaved phages from 0-, 2-, 5-, and 10-min reaction time points was used to monitor the change in mutation frequency as a function of time. The frequency for each amino acid variant at each position was derived from VWF73 clones bearing multiple amino acid substitutions (aggregate mutant). The apparent pseudo-first-order rate constant (kcat/KM) was calculated by fitting the normalized counts at each time point to Eq. S1. (A) The kapp for the WT amino acid at each position of VWF73 was calculated using sequencing reads derived from the majority of clones containing multiple amino acid substitutions. The difference between the kcat/KM for the WT amino acid at each position relative to the kcat/KM for pure WT VWF73 (5.78 × 10^7 M^−1·min^−1) serves as a correction factor for estimating the true kcat/KM for amino acid variants at the corresponding position, as each should share a subset of additional amino acid variants throughout VWF73. (B) The kcat/KM for each single amino acid substitution at each position of VWF73 was calculated and normalized to WT VWF73. Values greater than 1 represent variants that are cleaved at a faster rate than WT VWF73, and values less than 1 represent variants that are cleaved more slowly. (C) The average kcat/KM for each position of VWF73 (bar plot) and the individual values for each substitution at each position (heatmap) are shown. Amino acid substitutions within VWF73 that exhibit kcat/KM values faster (blue), equivalent (yellow), or slower (red) than WT VWF73 are illustrated. An additional 87 amino acid variants (white) had fewer than 10 reads in any of the four time points, which is insufficient for analysis. Trp and Met substitutions are under-represented in the dataset, as each is only encoded by a single codon and is therefore less likely to be present with sufficient reads at all reaction time points.
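To make the per-variant fit concrete, the sketch below estimates an apparent first-order rate constant from uncleaved fractions at the four time points named in the Fig. 3 legend. Eq. S1 is not reproduced in this text, so a log-linear least-squares fit is used as a stand-in for whatever fitting procedure the authors applied. The input fractions are synthetic and are assumed to be already normalized so that they are proportional to the amount of uncleaved substrate.

```python
from math import exp, log


def fit_first_order(times, fractions):
    """Fit ln(fraction) = ln(f0) - k_app * t by ordinary least squares.

    times     -- reaction times in minutes
    fractions -- uncleaved fraction at each time (must all be > 0)
    Returns (k_app, r_squared).
    """
    ys = [log(f) for f in fractions]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    syy = sum((y - y_mean) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return -slope, r_squared


TIMES_MIN = [0.0, 2.0, 5.0, 10.0]   # time points from the Fig. 3 legend

# Synthetic, noise-free data for a WT-like and a 10-fold slower variant.
wt_fractions = [exp(-0.2888 * t) for t in TIMES_MIN]
slow_fractions = [exp(-0.02888 * t) for t in TIMES_MIN]

k_wt, r2_wt = fit_first_order(TIMES_MIN, wt_fractions)
k_slow, r2_slow = fit_first_order(TIMES_MIN, slow_fractions)

# Normalizing to WT, as in Fig. 3B: values below 1 are cleaved more slowly.
relative_rate = k_slow / k_wt
print(round(k_wt, 4), round(relative_rate, 3))
```

With real counts, the R^2 returned by the fit plays the role of the goodness-of-fit filter mentioned in the text, where most variants fit the first-order model with R^2 between 0.76 and 1.0.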

Assessing the Effect of Type 2A VWD Mutations Within VWF73 Requires Additional Structural Features of VWF. Type 2A von Willebrand disease (VWD-2A) can result from mutations within the VWF A2 domain that potentiate cleavage by ADAMTS13 (22). Analysis of the kcat/KM values for known VWD-2A mutations within the VWF73 library showed that in contrast to the expected enhancement, 13/25 VWD-2A mutations resisted cleavage by ADAMTS13 (SI Appendix, Fig. S10), and the remaining mutations exhibited only modest enhancements in VWF73 proteolysis. A number of VWD-2A mutations have been shown to lower the threshold of shear required to unfold the VWF A2 domain, more readily exposing the cryptic scissile bond to ADAMTS13 (32, 34–36). Taken together, these data strongly suggest that VWF73 requires additional structural features within full-length VWF to effectively model the effect of VWD-2A mutations on ADAMTS13 proteolysis.

Evolutionary Sequence Conservation Reflects Effect on Cleavage by ADAMTS13. Although not evident among the mutations we assayed, epistasis may occur on larger evolutionary scales (37–40). To test for epistasis on the macroevolutionary scale, we compared the effect of aggregate mutations in VWF73 on kcat/KM with the per site rate of evolution in vertebrate orthologous sequences (41, 42). Sites with a greater effect on kcat/KM tended to evolve more slowly than sites with a lesser effect (Fig. 4), indicating that evolutionary conservation reflects the average effect of a mutation in the human VWF73 sequence.

The apparent lack of epistasis in mutational and evolutionary data raises the possibility that VWF73 contains fewer epistatic interactions than the rest of the VWF protein sequence. Thus, we mapped human disease mutations to the amino acid states found in VWF orthologs of other species, which allowed us to quantify the prevalence of long-term epistasis within VWF (37–39). The proportion of disease mutations found in other species within VWF73 was equivalent to the rest of the VWF protein (2/29 and 40/354, respectively; P = 0.39). None of the mutations that led to total loss of cleavage matched an amino acid state found in a vertebrate ortholog, which was significantly lower than the number of all disease mutations (0/141 versus 42/383, respectively; P ∼ 10^−6). We also observe a significant difference between the proportion of mutations found in other species when we distinguish non-VWD-2A and VWD-2A mutations (36/261 versus 6/122, respectively; P = 0.01), suggesting the propensity for epistasis varies among VWD subtypes. The loss-of-cleavage mutations were still less likely to be observed in other species compared with the VWD-2A mutations (0/141 versus 6/122, respectively; P = 0.01). However, the proportion of mutations that lower kcat/KM values (78/618) found in other species was similar to the proportion of mutations that increase kcat/KM values (78/541), and neither differed significantly from the corresponding proportion of all disease mutations or VWD-2A mutations (P >> 0.05 for all comparisons). These data indicate both that the average effect of a human disease mutation is intermediate to complete loss of function, and that both the slower and faster kcat/KM mutations are equally deleterious.

These data indicate several aspects of VWF73 mutations that cause significant change to kcat/KM. First, the low propensity of VWF73 epistasis across evolution suggests the potential value of evolutionary information to predict the functional effects of mutations in this segment of human VWF. Second, mutations that eliminate VWF73 cleavage altogether appear less epistatic than either their weaker-effect counterparts or known human disease mutations. Taken together, these observations suggest that loss-of-cleavage mutations produce a stronger phenotype than known disease mutations.

Discussion

This report demonstrates the power of combining substrate phage display with high-throughput sequencing to characterize the substrate sequence requirements for protease recognition with unprecedented single amino acid resolution. Analysis of sequence data from multiple time points enables massively parallel determination of apparent rate constants for a comprehensive set of potential amino acid substitutions. We applied this analysis to the interaction between ADAMTS13 and VWF, providing the first high-resolution substrate recognition landscape for these important regulators of blood coagulation.

ADAM, ADAMTS, and matrix metalloproteases (MMPs) form a family of related metalloproteases with broad biologic importance (17). MMPs have previously been shown to exhibit promiscuity for residues surrounding the substrate scissile bond (8, 43, 44). Screening a number of MMPs (including MMP-1, MMP-2, MMP-3, MMP-7, MMP-14, and MMP-26) against a panel of peptides (45) revealed promiscuity for all residues in the P4-P4′ interval except P1′, which exhibited a preference for aliphatic amino acids (46). However, a broad range of amino acids at all other positions, including P1, permitted cleavage. In contrast, we demonstrate here that VWF73 peptides bearing substitutions at or near the P1-P1′ interval are the most resistant to ADAMTS13 proteolysis, suggesting a strong preference for amino acid content near the scissile bond. At the P1 position, the most stringent in all of VWF73, only the conservative Tyr10Phe substitution exhibited a comparable kcat/KM to the WT amino acid. The observation that the P3-P2′ interval is highly conserved throughout vertebrate evolution also supports the importance of these residues for efficient proteolysis by ADAMTS13.

ADAMTS4, which is more closely related to ADAMTS13 than the MMPs (47), has also been shown to exhibit strong preference for residues surrounding the scissile bond, including Glu at P1, bulky hydrophobic residues at P1′, and Arg/Lys at P2′ (48). That ADAMTS4 is only distantly related to ADAMTS13 (47) may explain the lack of shared preferences for residues surrounding the scissile bond. However, both ADAMTS13 and ADAMTS4 share the feature of tight P1-P1′ specificity, which could emerge as an important theme in metalloprotease diversification during evolution.

The increasing application of whole-exome and whole-genome sequencing to human populations, particularly in the clinical setting, has created a pressing need for more accurate predictions of the effect of identified rare nonsynonymous variants on protein

Fig. 4. Per site rate of evolution in VWF Asp1596–Arg1668 correlates with kcat/KM. The average kcat/KM for amino acid substitutions at each position of VWF73 was plotted as a function of the rate of evolution at that position. The kcat/KM values for all positions with a similar rate of evolution were averaged (■), and linear regression analysis was performed, revealing an R^2 value of 0.93, illustrating good agreement between kinetic rate constants and the rate of evolution at each position of VWF73. Lower values of rate of evolution indicate conserved (relatively slow evolving) sites, whereas higher values correspond to fast-evolving, highly variable sites. (●) Average kcat/KM value and the average rate of evolution across 73 amino acid positions on the x and y axes, respectively.

9332 | www.pnas.org/cgi/doi/10.1073/pnas.1511328112 Kretz et al. Downloaded by guest on September 23, 2021 function. Currently available predictive computer algorithms (e.g., in the cloning of the mutant VWF73 library, recombinant peptides, and high- SIFT and PolyPhen2) exhibit high error rates (49). The approach throughput sequencing libraries are provided in SI Appendix, Tables S1–S4. described here could facilitate high-throughput cataloging of all potential mutations and their quantitative effect on protease or ACKNOWLEDGMENTS. We thank Isabel Wang and Vivian Cheung from the Life Sciences Institute, University of Michigan, for assistance with high- substrate function, allowing accurate prediction of disease risk/ throughput sequencing experiments and valuable discussions. We also thank severity for many important human diseases, including VWD J. Evan Sadler (Washington University) and Sriram Krishnaswamy (Children’s and thrombotic thrombocytopenic purpura. The annotation of Hospital of Philadelphia) for helpful discussions. We thank Jeff Weitz more than 550 proteases in the (accounting for (McMaster University), Jim Fredenburgh (McMaster University), and Steve Weiss (University of Michigan) for critical review of the manuscript. C.A.K. 2% of known ), with an anticipated similar or greater set of was awarded the Judith Graham Pool Fellowship from National Hemophilia protein substrates, suggests this approach could be extended to a Foundation. This work was supported by the National Institutes of Health number of other genes of clinical relevance. (R01 HL039693), the National Heart, , and Blood Institute (P01- HL057346), Ministerio de Economía y Competitividad Grants BFU2012- Materials and Methods 31329 and Sev-2012-0208, and European Research Council Starting Grant 335980_EinME. D.G. is an investigator of the Howard Hughes Medical In- Full details of the materials and methods used for this study are presented in SI stitute, and F.A.K. 
is a Howard Hughes Medical Institute International Early Appendix, Materials and Methods. Oligonucleotides and primer sequences used Career Scientist.

1. Schechter I, Berger A (1967) On the size of the active site in proteases. I. Papain. Biochem Biophys Res Commun 27(2):157–162.
2. Drag M, Salvesen GS (2010) Emerging principles in protease-based drug discovery. Nat Rev Drug Discov 9(9):690–701.
3. Kokame K, Matsumoto M, Fujimura Y, Miyata T (2004) VWF73, a region from D1596 to R1668 of von Willebrand factor, provides a minimal substrate for ADAMTS-13. Blood 103(2):607–612.
4. Luken BM, et al. (2005) The spacer domain of ADAMTS13 contains a major binding site for antibodies in patients with thrombotic thrombocytopenic purpura. Thromb Haemost 93(2):267–274.
5. Kretz CA, et al. (2010) HD1, a thrombin- and prothrombin-binding DNA aptamer, inhibits thrombin generation by attenuating prothrombin activation and thrombin feedback reactions. Thromb Haemost 103(1):83–93.
6. Xiang Y, de Groot R, Crawley JT, Lane DA (2011) Mechanism of von Willebrand factor scissile bond cleavage by a disintegrin and metalloproteinase with a thrombospondin type 1 motif, member 13 (ADAMTS13). Proc Natl Acad Sci USA 108(28):11602–11607.
7. Myles T, Yun TH, Hall SW, Leung LL (2001) An extensive interaction interface between thrombin and factor V is required for factor V activation. J Biol Chem 276(27):25143–25149.
8. Kridel SJ, et al. (2002) A unique substrate binding mode discriminates membrane type-1 matrix metalloproteinase from other matrix metalloproteinases. J Biol Chem 277(26):23788–23793.
9. Gallwitz M, Enoksson M, Thorpe M, Hellman L (2012) The extended cleavage specificity of human thrombin. PLoS ONE 7(2):e31756.
10. Ratnikov BI, et al. (2014) Basis for substrate recognition and distinction by matrix metalloproteinases. Proc Natl Acad Sci USA 111(40):E4148–E4155.
11. Shendure J, Lieberman Aiden E (2012) The expanding scope of DNA sequencing. Nat Biotechnol 30(11):1084–1094.
12. Fowler DM, et al. (2010) High-resolution mapping of protein sequence-function relationships. Nat Methods 7(9):741–746.
13. Ravn U, et al. (2013) Deep sequencing of phage display libraries to support discovery. Methods 60(1):99–110.
14. ’t Hoen PA, et al. (2012) Phage display screening without repetitious selection rounds. Anal Biochem 421(2):622–631.
15. Larman HB, et al. (2013) PhIP-Seq characterization of autoantibodies from patients with multiple sclerosis, type 1 diabetes and rheumatoid arthritis. J Autoimmun 43:1–9.
16. Larman HB, et al. (2011) Autoantigen discovery with a synthetic human peptidome. Nat Biotechnol 29(6):535–541.
17. Huxley-Jones J, et al. (2007) The evolution of the vertebrate metzincins; insights from Ciona intestinalis and Danio rerio. BMC Evol Biol 7:63.
18. Yee A, Kretz CA (2014) Von Willebrand factor: Form for function. Semin Thromb Hemost 40(1):17–27.
19. Levy GG, et al. (2001) Mutations in a member of the ADAMTS family cause thrombotic thrombocytopenic purpura. Nature 413(6855):488–494.
20. Zheng X, et al. (2001) Structure of von Willebrand factor-cleaving protease (ADAMTS13), a metalloprotease involved in thrombotic thrombocytopenic purpura. J Biol Chem 276(44):41059–41063.
21. Furlan M, Robles R, Lämmle B (1996) Partial purification and characterization of a protease from human plasma cleaving von Willebrand factor to fragments produced by in vivo proteolysis. Blood 87(10):4223–4234.
22. Bowen DJ (2004) Increased susceptibility of von Willebrand factor to proteolysis by ADAMTS13: Should the multimer profile be normal or type 2A? Blood 103(8):3246.
23. Mackie I, et al. (2013) Discrepancies between ADAMTS13 activity assays in patients with thrombotic microangiopathies. Thromb Haemost 109(3):488–496.
24. Parmley SF, Smith GP (1988) Antibody-selectable filamentous fd phage vectors: Affinity purification of target genes. Gene 73(2):305–318.
25. Fowler DM, Araya CL, Gerard W, Fields S (2011) Enrich: Software for analysis of protein function by enrichment and depletion of variants. Bioinformatics 27(24):3430–3431.
26. Desch KC, et al. (2015) Probing ADAMTS13 substrate specificity using phage display. PLoS ONE 10(4):e0122931.
27. Gao W, Anderson PJ, Majerus EM, Tuley EA, Sadler JE (2006) Exosite interactions contribute to tension-induced cleavage of von Willebrand factor by the antithrombotic ADAMTS13 metalloprotease. Proc Natl Acad Sci USA 103(50):19099–19104.
28. Gao W, Anderson PJ, Sadler JE (2008) Extensive contacts between ADAMTS13 exosites and von Willebrand factor domain A2 contribute to substrate specificity. Blood 112(5):1713–1719.
29. Igari A, et al. (2012) Identification of epitopes on ADAMTS13 recognized by a panel of monoclonal antibodies with functional or non-functional effects on catalytic activity. Thromb Res 130(3):e79–e83.
30. de Groot R, Lane DA, Crawley JT (2015) The role of the ADAMTS13 cysteine-rich domain in VWF binding and proteolysis. Blood 125(12):1968–1975.
31. Crawley JT, de Groot R, Xiang Y, Luken BM, Lane DA (2011) Unraveling the scissile bond: How ADAMTS13 recognizes and cleaves von Willebrand factor. Blood 118(12):3212–3221.
32. Zhang Q, et al. (2009) Structural specializations of A2, a force-sensing domain in the ultralarge vascular protein von Willebrand factor. Proc Natl Acad Sci USA 106(23):9226–9231.
33. Akiyama M, Takeda S, Kokame K, Takagi J, Miyata T (2009) Crystal structures of the noncatalytic domains of ADAMTS13 reveal multiple discontinuous exosites for von Willebrand factor. Proc Natl Acad Sci USA 106(46):19274–19279.
34. Xu AJ, Springer TA (2013) Mechanisms by which von Willebrand disease mutations destabilize the A2 domain. J Biol Chem 288(9):6317–6324.
35. Zhang X, Halvorsen K, Zhang CZ, Wong WP, Springer TA (2009) Mechanoenzymatic cleavage of the ultralarge vascular protein von Willebrand factor. Science 324(5932):1330–1334.
36. Zanardelli S, et al. (2006) ADAMTS13 substrate recognition of von Willebrand factor A2 domain. J Biol Chem 281(3):1555–1563.
37. Breen MS, Kemena C, Vlasov PK, Notredame C, Kondrashov FA (2012) Epistasis as the primary factor in molecular evolution. Nature 490(7421):535–538.
38. Kondrashov AS, Sunyaev S, Kondrashov FA (2002) Dobzhansky-Muller incompatibilities in protein evolution. Proc Natl Acad Sci USA 99(23):14878–14883.
39. Soylemez O, Kondrashov FA (2012) Estimating the rate of irreversibility in protein evolution. Genome Biol Evol 4(12):1213–1222.
40. Rosenbloom KR, et al. (2015) The UCSC Genome Browser database: 2015 update. Nucleic Acids Res 43(Database issue):D670–D681.
41. Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N (2010) ConSurf 2010: Calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res 38(Web Server issue):W529–W533.
42. Berezin C, et al. (2004) ConSeq: The identification of functionally and structurally important residues in protein sequences. Bioinformatics 20(8):1322–1324.
43. Chen EI, et al. (2002) A unique substrate recognition profile for matrix metalloproteinase-2. J Biol Chem 277(6):4485–4491.
44. Ratnikov B, Cieplak P, Smith JW (2009) High throughput substrate phage display for protease profiling. Methods Mol Biol 539:93–114.
45. Turk BE, Huang LL, Piro ET, Cantley LC (2001) Determination of protease cleavage site motifs using mixture-based oriented peptide libraries. Nat Biotechnol 19(7):661–667.
46. Park HI, Turk BE, Gerkema FE, Cantley LC, Sang QX (2002) Peptide substrate specificities and protein cleavage sites of human endometase/matrilysin-2/matrix metalloproteinase-26. J Biol Chem 277(38):35168–35175.
47. Nicholson AC, Malik SB, Logsdon JM, Jr, Van Meir EG (2005) Functional evolution of ADAMTS genes: Evidence from analyses of phylogeny and gene organization. BMC Evol Biol 5:11.
48. Hills R, et al. (2007) Identification of an ADAMTS-4 cleavage motif using phage display leads to the development of fluorogenic peptide substrates and reveals matrilin-3 as a novel substrate. J Biol Chem 282(15):11101–11109.
49. Gnad F, Baucom A, Mukhyala K, Manning G, Zhang Z (2013) Assessment of computational methods for predicting the effects of missense mutations in human cancers. BMC Genomics 14(Suppl 3):S7.

Kretz et al. PNAS | July 28, 2015 | vol. 112 | no. 30 | 9333