Amino Acid Scanning at P5′ Within the Bowman‒Birk Inhibitory Loop Reveals Specificity Trends for Diverse Serine Proteases

Choi Yi Li,†∆ Simon J. de Veer,†∆* Andrew M. White,† Xingchen Chen,‡ Jonathan M. Harris,‡ Joakim E. Swedberg,† David J. Craik†*

†Institute for Molecular Bioscience, The University of Queensland, Brisbane QLD 4072, Australia

‡Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane QLD 4059, Australia

∆authors contributed equally to this work

*Corresponding authors:

Professor David J. Craik, Institute for Molecular Bioscience, The University of Queensland, Brisbane QLD 4072, Australia. Email: [email protected]

Dr Simon J. de Veer, Institute for Molecular Bioscience, The University of Queensland, Brisbane QLD 4072, Australia. Email: [email protected]

Abstract

Sunflower trypsin inhibitor-1 (SFTI-1) is a 14-amino acid cyclic peptide whose inhibitory loop has a similar sequence and structure to that of a larger family of inhibitors, the Bowman‒Birk inhibitors.

Here, we focus on the P5′ residue in the Bowman‒Birk inhibitory loop and produce a library of SFTI-variants to characterize the P5′ specificity of 11 different proteases. We identify seven amino acids that are generally preferred by these enzymes and also correlate with P5′ sequence diversity in naturally occurring Bowman‒Birk inhibitors. Additionally, we show that several enzymes have divergent specificities that can be harnessed in engineering studies. By optimizing the P5′ residue, we improve the potency or selectivity of existing inhibitors for kallikrein-related peptidase 5 and show that a variant with substitutions at seven of the scaffold’s fourteen residues retains a similar structure to SFTI-1. These findings provide new insights into P5′ specificity requirements for the Bowman‒Birk inhibitory loop.

Introduction

Serine proteases are widely recognized as having important roles in numerous diseases, including chronic inflammation, coagulation disorders, infectious diseases and cancer.1,2 Accordingly, these enzymes are promising therapeutic targets and an increasing number of serine protease inhibitors have been approved for use in the clinic, including inhibitors of thrombin or factor Xa that are used to prevent thrombosis.2,3

Although most of the existing drugs that target serine proteases are small molecule inhibitors, attention is beginning to shift towards polypeptide scaffolds as these molecules provide more opportunities for optimizing the inhibitor’s affinity and selectivity. In cases where the inhibitor scaffold is based on a naturally occurring peptide or protein, there are additional factors to consider as, through the process of evolution, these frameworks have already been optimized and refined over millions of years for activity against their endogenous targets. Accordingly, understanding how each residue in the nature-derived scaffold influences its structure and function is important for guiding the engineering process for new targets.

One peptide scaffold that is becoming increasingly popular for inhibitor engineering is sunflower trypsin inhibitor-1 (SFTI-1).4 As its name suggests, this 14-amino acid cyclic peptide (Figures 1A‒C) is produced in sunflower seeds and is a potent inhibitor of trypsin (Ki < 0.1 nM).5,6 SFTI-1 also has activity against a range of serine proteases that are structurally similar to trypsin, including chymotrypsin, cathepsin G, matriptase, matriptase-2 and several kallikrein-related peptidases (KLKs).5-8 The basis for SFTI-1’s potent inhibitory activity has been explored in previous studies, with its cyclic backbone,9-11 intramolecular disulfide bond10,12 and binding loop conformation13 identified as important features. Additionally, the roles of SFTI-1’s key sidechain contacts have been studied using an alanine scan,14,15 contact β-strand (P1-P4) substitutions7 and P2′ substitutions.16 Together, these studies have identified specific residues that are suitable for substitution in engineering studies, and demonstrated the versatility of SFTI-1 as a design scaffold. Although there is now greater clarity on the importance of several features of the SFTI-1 scaffold, certain residues in the binding loop are yet to be thoroughly investigated, including the P5′ residue (Ile10).

Figure 1. SFTI-1, a cyclic peptide that displays the Bowman‒Birk inhibitory loop. (A) Ribbon diagram of the structure of SFTI-1 (green) in complex with bovine β-trypsin (lavender) (PDB ID 1SFI). The P1 and P5′ residues of SFTI-1 are shown in ball and stick representation, together with the disulfide bond between Cys3 and Cys11. (B) Structure of SFTI-1 shown in stick representation (PDB ID 1SFI). (C) Amino acid sequence of SFTI-1 in single letter code (shown in the same orientation as panel B). The position of the P5′ residue relative to P1, P2′ and P4 is illustrated. (D) Ribbon diagram of Medicago scutellata BBI (green) in complex with two separate bovine β-trypsin molecules (lavender and blue) (PDB ID 2ILN). The P1 and P5′ residues in each Bowman‒Birk inhibitory loop are shown in ball and stick representation, together with the disulfide bond.

The binding loop of SFTI-1 has a very similar sequence and structure to a larger family of plant-derived serine protease inhibitors, the Bowman‒Birk inhibitors (BBIs).17,18 Indeed, SFTI-1 was initially proposed to be part of the BBI family. However, recent findings have indicated that the two groups of inhibitors are separate and that replication of the binding loop in BBIs and SFTI-1 is probably a result of convergent evolution.19,20 BBIs are miniproteins (> 8 kDa) that are broadly distributed among plants from several angiosperm lineages, with BBIs from monocots typically containing a single inhibitory loop and BBIs from dicots containing two inhibitory loops that can each bind to a target protease independently (Figure 1D). The consensus sequence for the Bowman‒Birk inhibitory loop is CT(R/K)S(I/Y)PP(Q/I)C, which spans the segment from P3 to P6′ (Schechter-Berger nomenclature21). Previous studies have used synthetic nine-residue peptides based on the Bowman‒Birk inhibitory loop to examine the contribution of certain residues to inhibitory activity, including key sidechain contacts (P1 and P2′),22,23 residues involved in intramolecular hydrogen bonding (P2 and P1′)24,25 and the cis Pro8-trans Pro9 segment (P3′-P4′).26 However, the P5′ residue has yet to be studied even though it appears to be a contact residue and shows high sequence diversity in the inhibitory loop of naturally occurring BBIs. Understanding the role of this residue would be useful for engineering studies that are based on inhibitors that feature the Bowman‒Birk inhibitory loop, including SFTI-1 and members of the BBI family, or that closely resemble this motif, such as trypsin inhibitors produced by certain frog species.27
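The consensus sequence quoted above can be written as a simple pattern, which makes the position of P5′ within the loop explicit. The following is a minimal sketch, assuming sequences are written linearly from Gly1 (so a loop that wraps around the cyclization point would not be found); the function name `find_loop` is illustrative and not part of the original study.

```python
import re

# Consensus Bowman-Birk inhibitory loop, P3 to P6':
# C(P3) T(P2) R/K(P1) S(P1') I/Y(P2') P(P3') P(P4') Q/I(P5') C(P6')
BBI_LOOP = re.compile(r"CT[RK]S[IY]PP[QI]C")

def find_loop(sequence: str):
    """Return (start_index, matched_loop) for the first consensus match, or None."""
    match = BBI_LOOP.search(sequence)
    return (match.start(), match.group()) if match else None

# Sequences as listed in Tables 1 and 2, written linearly from Gly1.
print(find_loop("GRCTKSIPPICFPD"))  # SFTI-1
print(find_loop("GTCTRSIPPICNPN"))  # library parent (compound Ile10)
print(find_loop("GTCTRSIPPDCNPN"))  # Asp10: P5' Asp lies outside the consensus
```

Both SFTI-1 and the library parent match the consensus, whereas a variant such as Asp10 does not, because only Gln or Ile is allowed at P5′ in the consensus as written.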

In this study, we designed and synthesized a library of SFTI-variants with chemical diversity introduced at the P5′ position to explore the significance of this residue in the Bowman‒Birk inhibitory loop. The library was screened against 11 serine proteases with diverse active site specificities, including human cationic trypsin, bovine β-trypsin, bovine chymotrypsin, , matriptase, thrombin, factor XIIa and four members of the KLK family. Comparing the specificity profiles for each protease identified a subset of seven residues that were generally preferred by each protease. Divergent P5′ preferences were also found for certain proteases, with interactions for selected inhibitors in complex with different proteases examined using molecular dynamics simulations. Additionally, unique P5′ preferences were used to engineer existing SFTI-based inhibitors to improve their activity and selectivity. Collectively, these findings provide new insights into the role of the P5′ residue in the Bowman‒Birk inhibitory loop and define the potential of this residue for optimization in inhibitor engineering studies.

Results

Design and synthesis of the P5′ inhibitor library

To explore the role of the P5′ residue in the Bowman‒Birk inhibitory loop, we designed an inhibitor library containing 21 SFTI-variants with diverse P5′ residues. The SFTI-variant selected for generating the inhibitor library was a broad-range variant (compound Ile10, Table 1) that we previously used to develop an inhibitor library for screening the S2′ specificity of different serine proteases16 and whose crystal structure in complex with bovine β-trypsin we recently reported.28 For the P5′ library, each of the canonical amino acids (excluding Cys to avoid disulfide bond isomers and Pro to avoid having three consecutive Pro residues) was incorporated at the P5′ position of the SFTI-scaffold. We also included three non-canonical amino acids: norleucine (Nle, n), 4,4′-biphenyl-L-alanine (Bip, B) and citrulline (Cit, X).

Each variant in the SFTI library was produced by Fmoc solid phase synthesis, cyclized in solution, then analyzed by LC-MS (Supporting Information, Figure S1), MALDI-TOF MS (Table 1) and 1H 1D NMR spectroscopy (Figure S2). Generally, the SFTI-variants showed a single peak with correct mass when analyzed by LC-MS. An additional minor peak with the correct mass was also observed for several peptides that contained an aromatic P5′ residue (Phe10, Tyr10 and Bip10). After further HPLC purification, we could isolate each of the two peaks for these peptides. However, analysis of the major peak by LC-MS revealed that it contained two components, which suggests that these peptides potentially have more than one conformation co-existing in equilibrium. This hypothesis was supported by data from 1H 1D NMR spectroscopy experiments. Most SFTI-variants displayed sharp and well-dispersed peaks in the amide spectral region (7–9 ppm), suggesting that these SFTI-variants have one conformation in solution (Figure S2). However, additional low intensity signals were observed for compounds with aromatic P5′ residues, particularly Phe10 and Tyr10, which could indicate the presence of additional conformations. Additionally, peaks in the amide region for Gly10 were not well dispersed and the number of peaks was increased. This finding suggests that substituting the P5′ residue with Gly increased the flexibility of the peptide backbone, which likely resulted in multiple conformations, as reported previously for SFTI-1 when P5′ Ile was substituted with Gly.29

Table 1. Sequences, masses and purity of SFTI-variants

Compound  Sequencea          Theoretical [M+H]+  Determined [M+H]+  Purity (%)
Ala10     c[GTCTRSIPPACNPN]  1410.63             1410.57            96.5
Asp10     c[GTCTRSIPPDCNPN]  1454.62             1454.56            99.9
Glu10     c[GTCTRSIPPECNPN]  1468.64             1468.58            99.9
Phe10     c[GTCTRSIPPFCNPN]  1486.67             1486.60            99.9
Gly10     c[GTCTRSIPPGCNPN]  1396.62             1396.56            99.9
His10     c[GTCTRSIPPHCNPN]  1476.66             1476.59            99.9
Ile10     c[GTCTRSIPPICNPN]  1452.68             1452.63            99.9b
Lys10     c[GTCTRSIPPKCNPN]  1467.69             1467.67            98.3
Leu10     c[GTCTRSIPPLCNPN]  1452.68             1452.66            99.9
Met10     c[GTCTRSIPPMCNPN]  1470.64             1470.65            99.9
Asn10     c[GTCTRSIPPNCNPN]  1453.64             1453.59            99.9
Gln10     c[GTCTRSIPPQCNPN]  1467.66             1467.63            99.9
Arg10     c[GTCTRSIPPRCNPN]  1495.70             1495.67            99.9
Ser10     c[GTCTRSIPPSCNPN]  1426.63             1426.62            99.9
Thr10     c[GTCTRSIPPTCNPN]  1440.64             1440.62            99.9
Val10     c[GTCTRSIPPVCNPN]  1438.67             1438.68            99.9
Trp10     c[GTCTRSIPPWCNPN]  1525.68             1525.68            99.9
Tyr10     c[GTCTRSIPPYCNPN]  1502.66             1502.64            99.8
Nle10     c[GTCTRSIPPnCNPN]  1452.68             1452.70            99.9
Bip10     c[GTCTRSIPPBCNPN]  1562.58             1562.71            99.9
Cit10     c[GTCTRSIPPXCNPN]  1496.68             1496.69            99.9

a Peptides are cyclized head-to-tail between Gly1 and Asn14 and contain an intramolecular disulfide bond between Cys3 and Cys11. Residues substituted from compound Ile10 are shown in bold (n = norleucine, B = 4,4′-biphenyl-L-alanine, X = citrulline).
b Characterization reported in ref. 16

Figure 2. Specificity profiles from screening the P5′ SFTI-library against 11 serine proteases. The 21 SFTI-variants in the inhibitor library are represented across the x-axis according to their P5′ residue (in single letter amino acid code, where n = norleucine, B = 4,4′-biphenyl-L-alanine and X = citrulline). For each enzyme, inhibitors were screened at the same concentration (indicated in the panel title) and the % inhibition is shown on the y-axis. Data are expressed as the mean ± SEM from three independent experiments performed in duplicate. Assay conditions for each enzyme, including concentrations of enzyme and substrate, are provided in Table S3 and an analysis of protease sequence and structural similarity is provided in Table S4. The panel “Average Preference” indicates the activity of each variant averaged across all enzymes (thrombin was excluded from this analysis due to its markedly different specificity profile where Ile, Nle and His were almost exclusively preferred). Values were determined by normalizing the data for each enzyme to the most preferred variant (set as 100%), then calculating the average across the ten enzymes, as described previously.16
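The “Average Preference” calculation described in the caption can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions rather than the authors' analysis script: the function name, the dictionary layout and the numbers in `example` are all hypothetical and are not the data plotted in Figure 2.

```python
def average_preference(inhibition, exclude=()):
    """Average normalized activity per variant.

    inhibition: {enzyme: {variant: percent inhibition}} (assumed layout).
    Each enzyme's profile is scaled so its best variant equals 100%,
    then the scaled values are averaged across the retained enzymes.
    """
    enzymes = [e for e in inhibition if e not in exclude]
    variants = list(inhibition[enzymes[0]])
    normalized = {}
    for enzyme in enzymes:
        best = max(inhibition[enzyme].values())
        normalized[enzyme] = {v: 100.0 * inhibition[enzyme][v] / best for v in variants}
    return {v: sum(normalized[e][v] for e in enzymes) / len(enzymes) for v in variants}

# Hypothetical inhibition values for three P5' variants (Q, I, G).
example = {
    "enzyme_A": {"Q": 80.0, "I": 60.0, "G": 20.0},
    "enzyme_B": {"Q": 45.0, "I": 90.0, "G": 9.0},
    "thrombin": {"Q": 5.0, "I": 95.0, "G": 1.0},
}
print(average_preference(example, exclude={"thrombin"}))
```

Normalizing before averaging prevents enzymes that were screened at a more inhibitory concentration from dominating the average.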

Characterizing the specificity of trypsin and chymotrypsin using the P5′ SFTI-library

The first enzymes screened against the P5′ library were the prototypic serine proteases, bovine β-trypsin and bovine chymotrypsin. Overall, the specificity profiles for these two proteases indicated that they preferred similar residues at P5′ (Figure 2). For each enzyme, the most preferred P5′ residue was Gln, whereas Nle, Met, Leu and Ile were all in the top-third of residues for both proteases. Additionally, Gly and acidic residues (Asp and Glu) were the three least preferred residues for both enzymes. We also screened human cationic trypsin as, in a recent study, we showed that this enzyme has a different binding specificity to bovine β-trypsin at another primed subsite (S2′).30 In the P5′ library, Gln was also the most preferred residue for human cationic trypsin and Nle, Met, Leu and Ile were all well-tolerated. Basic residues (Arg and Lys) were among the least favored residues for human cationic trypsin, whereas Arg was the second most preferred residue for bovine β-trypsin. Additionally, human cationic trypsin was more tolerant of acidic residues (Asp preferred over Glu). Together, these findings indicate that human cationic trypsin and bovine β-trypsin have several similarities, but also key differences, in their P5′ specificities.

To explore the contrasting P5′ preferences for bovine β-trypsin (basic residues) and human cationic trypsin (acidic residues), we produced models of each protease-inhibitor complex and performed molecular dynamics simulations. Simulations for human cationic trypsin revealed that P5′ Asp was in close proximity to a surface-exposed Arg residue (Arg96, chymotrypsin numbering), indicating that the preference for acidic residues is due to a favorable electrostatic interaction (Figure S3). Additionally, the presence of Arg in the S5′ subsite likely explains the weak activity of SFTI-variants containing P5′ Arg or Lys against human cationic trypsin. By contrast, the S5′ subsite for bovine β-trypsin has a different biophysical profile. Sequence divergence at several positions, including Ser96, leaves space for P5′ Arg to extend into the S5′ subsite and form potential hydrogen bonds with the backbone carbonyl of His57 and/or Tyr59 (Figure S3). We also performed simulations to examine the shared preference for P5′ Gln. For human cationic trypsin, the Gln sidechain in SFTI-Gln10 was in a similar orientation to P5′ Gln in the crystal structure of the Medicago scutellata Bowman‒Birk inhibitor in complex with trypsin and appeared to form a hydrogen bond with the backbone carbonyl of His57.31 The position of this residue is strictly conserved in (chymo)trypsin-like serine proteases as His57 is part of the enzyme’s catalytic triad.

Members of the kallikrein-related peptidase family show unique P5′ specificity profiles

We next extended the library screen to other proteases that are structurally similar to trypsin by studying the P5′ specificity of several members of the KLK family. For these assays, we screened three KLKs that have trypsin-like P1 specificity (KLK4, KLK5 and KLK14) together with one KLK that has chymotrypsin-like P1 specificity (KLK7). Interestingly, the most preferred P5′ residue for each KLK was unique. For KLK7 and KLK14, the most potent SFTI-variant had a hydrophobic residue at P5′ (Ile and Nle, respectively), whereas Gln was the most preferred residue for KLK4 (Figure 2). Despite this, there were several residues that were well-tolerated by all three proteases, including Bip, Phe, Nle, Met and Gln, which were in the top-third of residues for KLK4, KLK5 and KLK14. Additionally, Gly was not favored and acidic P5′ residues were among the weakest inhibitors for KLK4 and KLK7, with basic residues showing higher activity against these enzymes. By contrast, basic residues and acidic residues appeared to be similarly tolerated by KLK14. Interestingly, KLK5 displayed a different specificity profile to the other KLKs, with acidic residues being most preferred, followed by aromatic residues (Tyr, Bip, Trp and Phe). Additionally, basic residues were disfavored, with Lys and Arg (together with Gly) among the least preferred residues for KLK5. Collectively, these findings identify several substitutions that can potentially be used to improve the potency and selectivity of existing SFTI-based inhibitors that target KLK proteases.

Extensions in surface loop II influence the P5′ specificity of matriptase, factor XIIa and thrombin

A distinctive feature of several (chymo)trypsin-like proteases is the presence of additional residues in loop II. In the prototypic serine proteases trypsin and chymotrypsin, loop II is relatively short and spans residues 59‒63. However, matriptase, factor XIIa and thrombin each have extensions in loop II that range from four to nine amino acids. The structure of matriptase in complex with SFTI-1 has been determined by X-ray crystallography and shows that loop II is in close proximity to the P5′ residue.32 Indeed, the additional residues in loop II provide new potential binding contacts between matriptase and a bound SFTI-variant.6,8,32 Accordingly, we screened the P5′ SFTI-library against matriptase, factor XIIa and thrombin to determine whether proteases with loop II insertions had different specificity profiles.

In the matriptase/SFTI-1 structure, the S5′ subsite of matriptase is electronegative with contributions from Asp60B (loop II extension) and Asp96. Screening the P5′ library against matriptase revealed that it strongly favored basic residues, with Arg most preferred, followed by Lys and His. The preference for Arg and Lys appeared to be related to charge as the activity of the P5′ Cit variant was considerably weaker than Arg. Hydrophobic residues (Nle and Leu) were also tolerated by matriptase and acidic residues were the least favored. These findings are consistent with previous studies showing that substituting P5′ Ile in SFTI-1 with Arg or Lys led to an improvement in activity against matriptase,6,8 whereas a variant with P5′ Asp showed markedly weaker activity.6 For factor XIIa, Arg was also the most-preferred P5′ residue, followed by hydrophobic residues (Nle, Leu and Ile), His and Tyr. The recently reported structure of factor XIIa indicates that Asp60A is well-positioned to interact with P5′ Arg via a salt bridge, as seen in molecular dynamics simulations (Figure S4), whereas the hydrophobic surface of loop IV likely mediates interaction with Nle, Leu or Ile. Thrombin has a characteristic insertion in loop II (named the thrombin loop) which narrows the protease’s active site cleft and contributes to its restricted specificity.33 Unlike other proteases screened against the SFTI-library, thrombin displayed a strong preference for Ile, with only two additional variants (P5′ His or Nle) showing appreciable activity. Molecular dynamics simulations indicated that the thrombin loop, particularly Tyr60A and Trp60D, creates a shallow S5′ pocket that accommodates P5′ Ile (Figure S4). However, it is likely that other factors contribute to thrombin’s narrow specificity as residues with similar chemical properties, including Val and Leu, were clearly outperformed by Ile.

Substituting the P5′ residue in existing SFTI-based inhibitors for KLK5 improves their activity and selectivity

Comparing specificity profiles across the P5′ library revealed that a number of residues were generally preferred by the proteases included in the inhibitor screen. Indeed, when the activity data for each protease were normalized, we identified seven residues that were broadly tolerated (Gln, Ile, Leu, Met, Nle, Phe and Bip) based on the criterion of showing more than 75% relative activity against at least eight of the ten proteases included in the analysis (Figure S5, illustrated as Average Preference in Figure 2). Since the range of residues that were tolerated at P5′ was relatively broad, we hypothesized that the P5′ residue is amenable to optimization when engineering SFTI-based inhibitors. One protease that appeared to be particularly suitable for testing P5′ substitutions was KLK5 as it has proven to be challenging to design highly selective SFTI-variants for this enzyme.34 KLK5 is a key protease in epidermal homeostasis and has emerged as a therapeutic target in prevalent and rare skin pathologies, including atopic dermatitis35 and Netherton syndrome.36,37 Further studies have shown that KLK5 is also a protease of interest in ovarian cancer,38 oral squamous cell carcinoma39 and influenza.40 Our data from the SFTI-library screen indicated that KLK5 preferred acidic residues at P5′ (Asp and Glu), which was different to other KLKs (Figure 2). Therefore, we selected two previously reported SFTI-variants with different P4-P1 sequences34 and substituted P5′ Ile with either Asp or Glu to generate compounds 1‒4 (Table 2).
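The criterion used to call a residue broadly tolerated (more than 75% relative activity against at least eight of the ten proteases) amounts to a simple count over the normalized profiles. The sketch below assumes the data have already been scaled so that each enzyme's best variant is 100%; the function name and the example values are hypothetical and do not reproduce Figure S5.

```python
def broadly_tolerated(relative_activity, threshold=75.0, min_enzymes=8):
    """List variants exceeding `threshold` % relative activity for at least
    `min_enzymes` enzymes.

    relative_activity: {enzyme: {variant: % of that enzyme's best variant}}
    (assumed layout).
    """
    counts = {}
    for profile in relative_activity.values():
        for variant, value in profile.items():
            counts[variant] = counts.get(variant, 0) + (value > threshold)
    return sorted(v for v, n in counts.items() if n >= min_enzymes)

# Hypothetical profiles for ten enzymes: Gln passes for eight of them,
# Asp for only two.
profiles = {f"enzyme_{i}": {"Q": 90.0, "D": 30.0} for i in range(8)}
profiles.update({f"enzyme_{i}": {"Q": 60.0, "D": 95.0} for i in range(8, 10)})
print(broadly_tolerated(profiles))
```

A residue such as the hypothetical Asp above can fail this test while still being the best choice for an individual enzyme, which is the situation exploited for KLK5.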

The P4‒P1 sequence of compounds 1 and 2 is YCNR, which is based on a favored KLK5 substrate sequence.34 The effect of substituting P5′ Ile with Asp or Glu was examined by performing competitive inhibition assays against KLK5 and determining the inhibition constant (Ki). The parent compound (P5′ Ile) is a potent KLK5 inhibitor, but has only 12-fold selectivity over KLK14.34 Compounds 1 and 2 displayed improved activity against KLK5 by 5.8-fold and 12.4-fold, respectively, with P5′ Glu (2) producing the most potent inhibitor (Ki = 0.34 ± 0.04 nM, Table 2). Accordingly, we assessed the selectivity of 2 against closely-related KLKs (KLK7 and KLK14), human cationic trypsin and matriptase. These assays revealed that 2 displayed 53-fold selectivity over KLK14 and weak activity against KLK7, trypsin and matriptase (Table 2). Thus, substituting P5′ Ile with Glu improved both the potency and selectivity of the existing KLK5 inhibitor.

Table 2. Inhibition constants (Ki) for SFTI-based inhibitors targeting KLK5

                             Ki (nM)
        Sequencea            KLK5         KLK7            KLK14       Matriptase      Human trypsin
SFTI-1  c[GRCTKSIPPICFPD]    143b         IC50 > 10,000c  25b         0.03d           200e
1       c[GYCNRSYPPDCFPN]    0.73 ± 0.02  -               -           -               -
2       c[GYCNRSYPPECFPN]    0.34 ± 0.04  IC50 > 10,000   18 ± 0.2    IC50 > 2,000    IC50 > 10,000
3       c[GFCHRSYPPDCFPN]    8.9 ± 0.3    -               -           -               -
4       c[GFCHRSYPPECFPN]    3.7 ± 0.2    IC50 > 10,000   79 ± 1.3    IC50 > 10,000   IC50 > 10,000
5       c[GFCHRSYPPECWPN]    2.4 ± 0.08   IC50 > 5,000    150 ± 4.2   IC50 > 10,000   IC50 > 10,000

a All peptides are cyclized head-to-tail between Gly1 and Asp14 (or Asn14), and contain an intramolecular disulfide bond between Cys3 and Cys11. Residues substituted from SFTI-1 are underlined and the P5′ residue is shown in bold.
b Data reported in ref. 13
c Data reported in ref. 57
d Data reported in ref. 30

We next tested P5′ substitutions in a second SFTI-variant that has the P4‒P1 sequence FCHR, which is based on the binding loop of a skin-expressed Kazal-type inhibitor that is selective for KLK5.41,42 Substituting P5′ Ile with Asp or Glu produced compounds 3 and 4, respectively. These SFTI-variants were slightly less potent against KLK5 than 1 and 2, with P5′ Glu (4) producing the most potent inhibitor (Ki = 3.7 ± 0.2 nM, Table 2). Additionally, 4 displayed 21-fold selectivity over KLK14 and less than 50% inhibition at 10 µM against KLK7, trypsin and matriptase. In a previous study on SFTI-based KLK5 inhibitors, we showed that Phe12 can be substituted with Trp to improve the inhibitor’s activity.13 Accordingly, we produced an additional inhibitor (5) that was based on 4 and contained the additional Phe12 to Trp substitution. The parent compound for this inhibitor (P5′ Ile, Trp12) shows activity against both KLK14 (28-fold selectivity) and KLK7 (23-fold selectivity).13 Substituting P5′ Ile with Glu (5) produced a 2.6-fold improvement in activity against KLK5 (Ki = 2.4 ± 0.08 nM, Table 2) that was accompanied by a marked loss in activity against KLK7 (less than 50% inhibition at 5 µM). Compound 5 also displayed 62-fold selectivity over KLK14 and weak activity against trypsin and matriptase.
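The fold-selectivity values quoted for compounds 2, 4 and 5 are ratios of inhibition constants (Ki for the off-target protease divided by Ki for KLK5). As a quick check, the sketch below recomputes them from the rounded mean Ki values listed in Table 2; the helper name `fold_selectivity` is illustrative only.

```python
# Mean Ki values (nM) copied from Table 2; errors are omitted for simplicity.
KI_NM = {
    "2": {"KLK5": 0.34, "KLK14": 18.0},
    "4": {"KLK5": 3.7, "KLK14": 79.0},
    "5": {"KLK5": 2.4, "KLK14": 150.0},
}

def fold_selectivity(ki_target, ki_off_target):
    """Selectivity for the target expressed as a ratio of inhibition constants."""
    return ki_off_target / ki_target

for compound, ki in KI_NM.items():
    fold = fold_selectivity(ki["KLK5"], ki["KLK14"])
    print(f"compound {compound}: {fold:.1f}-fold selective for KLK5 over KLK14")
```

The ratios come out at roughly 52.9, 21.4 and 62.5, which agrees with the 53-, 21- and 62-fold values stated in the text to within rounding of the tabulated Ki values.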

Engineered SFTI-variants with P5′ substitutions display similar structural characteristics to SFTI-1

Compared to SFTI-1, the newly generated KLK5 inhibitors have amino acid substitutions at six (2, 4) or seven (5) residues distributed across the SFTI-scaffold. Accordingly, we analyzed each peptide by NMR spectroscopy to study their structural characteristics. Data from TOCSY and NOESY experiments were used to assign spin systems for each amino acid and secondary Hα chemical shifts were subsequently determined by subtracting reported random coil Hα shifts43 from the experimentally measured shifts derived from spectral assignment. Overall, the secondary Hα shifts for each of the three SFTI-variants were similar to SFTI-1 (Figure 3). Local differences were observed at Cys3 Hα where each peptide displayed an upfield shift of 0.5‒0.6 ppm. In these compounds, both residues that flank Cys3 had been substituted (Arg2 and Thr4) and, in other variants, we have observed a similar upfield shift for Cys3 Hα when substitutions were performed at these positions. Additionally, an upfield shift of 0.8 ppm was observed for Pro8 Hα. Compounds 2, 4 and 5 each have Tyr at P2′ and, in previous studies, an upfield shift for Pro8 Hα was also observed when an aromatic residue was present at P2′.30
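The secondary shift calculation is a per-residue subtraction (observed Hα shift minus random coil Hα shift), with a separate random coil value used for residues that precede Pro. The following is a minimal sketch of that bookkeeping for a cyclic peptide. All chemical shift numbers below are placeholders chosen for illustration; they are not the reference values from ref. 43 or measured data, and the function name is hypothetical.

```python
def secondary_shifts(residues, observed, random_coil, random_coil_pre_pro=None):
    """Secondary Halpha shifts (ppm) = observed - random coil, per residue.

    residues: three-letter codes in sequence order (treated as cyclic, so the
    residue after the last one is the first). random_coil_pre_pro optionally
    holds separate reference values for residues that precede Pro.
    """
    random_coil_pre_pro = random_coil_pre_pro or {}
    n = len(residues)
    shifts = []
    for i, (res, obs) in enumerate(zip(residues, observed)):
        precedes_pro = residues[(i + 1) % n] == "Pro"
        if precedes_pro and res in random_coil_pre_pro:
            reference = random_coil_pre_pro[res]
        else:
            reference = random_coil[res]
        shifts.append(round(obs - reference, 2))
    return shifts

# Placeholder values only (ppm); not taken from ref. 43 or from Figure 3.
residues = ["Ser", "Ile", "Pro"]
observed = [4.60, 4.40, 4.10]
random_coil = {"Ser": 4.50, "Ile": 4.20, "Pro": 4.40}
random_coil_pre_pro = {"Ile": 4.45}
print(secondary_shifts(residues, observed, random_coil, random_coil_pre_pro))
```

Negative values correspond to upfield shifts relative to random coil, which is the sense in which the Cys3 and Pro8 Hα differences are reported above.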

Figure 3. Secondary Hα chemical shift analysis for selected SFTI-based inhibitors targeting KLK5. Secondary Hα chemical shifts were calculated for each residue by subtracting the reported random coil Hα chemical shift43 from the observed Hα chemical shift. For residues that preceded a Pro residue, separate random coil values were used for each amino acid to account for the influence of Pro on Hα chemical shifts.43 Values for SFTI-1 are shown in green and the sequence of SFTI-1 is shown on the x-axis in green text. The seven residues that were substituted in one or more of the SFTI-variants are enclosed by a square and the corresponding residue in compounds 2, 4 and 5 is shown below. Values for these peptides are shown in purple (2), orange (4) or blue (5).

We subsequently focused on 5 as this variant had the highest number of substitutions and performed additional NMR experiments to characterize its three-dimensional solution structure. TOCSY, NOESY, 1H‒13C HSQC and 1H‒15N HSQC experiments were performed and initial structures were subsequently generated using CYANA.44 These structures were optimized using CNS protocols,45 generating 100 structures, from which a final ensemble of 20 was selected based on lowest energy, violations less than 0.3 Å or 2° and MolProbity statistics.46 Overall, the backbone conformation of 5 overlaid closely with the previously reported NMR structure of SFTI-1 (PDB ID 1JBL),9 particularly across the inhibitory loop (residues 3‒10) as shown in Figure 4. Minor differences were observed at residue 12, which was potentially due to subtle movements required to accommodate the bulky Trp residue in place of Phe. Additionally, the position and orientation of Gly1 was different in 5 compared to SFTI-1. This result may reflect that Gly1 is flanked by different residues in 5 (Phe and Asn replace Arg2 and Asp14, respectively). Additionally, fewer restraints were used in structure calculations for 5 compared to SFTI-1. We also performed molecular dynamics simulations to examine the potential binding interactions between 5 and KLK5. These analyses indicated that P5′ Glu was positioned to engage Lys60 in the KLK5 active site via an electrostatic interaction (Figure 5A‒B). Additional favorable interactions at other positions, including P2′ Tyr, P2 His and P4 Phe, did not appear to be affected by the Ile to Glu substitution at P5′. Simulations were also performed for KLK5 in complex with 2. Although this inhibitor has different substitutions at several positions, including at P4 (Tyr) and P2 (Asn), it appeared that the predicted electrostatic interaction between P5′ Glu and Lys60 was maintained (Figure 5C‒D).

Figure 4. Compound 5 retains a similar backbone structure to SFTI-1. Overlay of the solution structure of compound 5 (blue) with the solution structure of SFTI-1 (green, PDB ID 1JBL). A backbone trace (N, Cα and C atoms) is shown for each peptide and residues 1, 3, 5 (P1), 10 (P5′) and 12 are labelled at the position of the corresponding Cα atom.

Discussion and Conclusions

In this study we explored the role of the P5′ residue in the Bowman‒Birk inhibitory loop by synthesizing and screening a library of cyclic peptides based on SFTI-1. We profiled the P5′ specificity of 11 serine proteases and identified a subset of seven amino acids (Gln, Ile, Leu, Met, Nle, Phe and Bip) that are generally preferred by these enzymes. Interestingly, several of these amino acids (Gln, Ile and Leu) are also among the most prevalent P5′ residues in naturally occurring BBIs, indicating that sequence evolution at this position may have been driven by a need to achieve broad target selectivity. However, several proteases also displayed unique P5′ preferences, including KLK5, matriptase and thrombin. Focusing on KLK5, we examined whether substitutions at P5′ could improve the potency and selectivity of existing SFTI-based inhibitors for this protease. Substituting P5′ Ile with Glu improved the activity and selectivity of two SFTI-variants with different binding loop sequences and we showed by NMR spectroscopy that these peptides retained a similar structure to SFTI-1, despite having amino acid substitutions at up to seven of the scaffold’s fourteen residues.

Figure 5. Exploring the interaction of SFTI-variants optimized at the P5′ position with KLK5 using molecular dynamics simulations. Models of each complex were generated by performing cluster analysis on three independent simulation trajectories, with the graphic illustrating a representative frame from the largest cluster. The panels show KLK5 in complex with (A‒B) the most selective inhibitor (compound 5) and (C‒D) the most potent inhibitor (compound 2). In panels A and C, the inhibitor is shown in stick model (carbon atoms: green, nitrogen atoms: blue, oxygen atoms: red, sulfur atoms: yellow) and KLK5 is shown as a ribbon diagram (lavender) with Lys60 shown in stick model (carbon atoms: lavender). For clarity, the P5′ residue (Glu10) is labeled. Panels B and D show the protease/inhibitor complex in the same orientation as panel A or C, with KLK5 shown as a surface representation (colored by electrostatic potential – blue: positive, red: negative) and the inhibitor shown in stick model.

The main finding from this study is a definition of the range of amino acids that are tolerated at the P5′ position in the Bowman‒Birk inhibitory loop. We included 11 enzymes in the library screen and comparing their specificity profiles revealed that Gln was the optimal P5′ residue for the highest number of proteases (five), followed by Ile and Arg (two). However, there was also a subset of seven amino acids (Gln, Ile, Leu, Met, Nle, Phe and Bip) that appeared to be favored by most proteases (Figure 2, Figure S5), indicating that a range of residues with different biophysical properties (polar, aliphatic or aromatic) are broadly tolerated at P5′. In a recent study, we used another SFTI-based library to identify preferred amino acids at the P2′ position in the Bowman‒Birk inhibitory loop.16 That library screen found only two amino acids (Ile and Val) that were broadly tolerated and most proteases displayed unique specificity profiles.

SFTI-1 and BBIs not only share the same binding loop structure, but also belong to a larger class of protease inhibitors that are termed Laskowski (or standard mechanism) inhibitors.47,48 These inhibitors have a characteristic binding loop called the canonical loop that has a highly conserved conformation from P3 to P2′, even though the various inhibitors are unrelated in terms of overall structure.48 Figure 6 shows an overlay of eight structurally diverse Laskowski inhibitors (determined by X-ray crystallography in complex with various trypsin-like proteases) and illustrates that the position of the P2′ residue is highly conserved in inhibitors beyond SFTI-1 and BBIs. Accordingly, the P2′ residue is ideally positioned to interact with the canonical S2′ subsite of the target protease, as seen in the respective crystal structures. This subsite is well-defined in the active site cleft and has evolved to contribute to the recognition of specific substrates and inhibitors by individual proteases.49 By contrast, the position of the P5′ residue in both SFTI-1 and BBIs is different to other Laskowski inhibitors (Figure 6). Indeed, structures of each protease/inhibitor complex indicate that the P5′ residue is not typically a close binding contact, except in inhibitors that feature the Bowman‒Birk inhibitory loop. This observation suggests that the region of the active site where the P5′ residue of BBIs interacts has a different role in substrate and inhibitor recognition compared to the P2′ residue, and potentially explains why the number of broadly tolerated residues in the P5′ SFTI-library screen is higher than in the previous P2′ screen.

Figure 6. The Bowman‒Birk inhibitory loop places the P5′ residue in a different position to other Laskowski inhibitors. Overlay of the structures of eight Laskowski inhibitors with different folds, determined by X-ray crystallography in complex with a trypsin-like protease. Structures of SFTI-1 (PDB ID 1SFI) and Medicago scutellata BBI (PDB ID 2ILN) are shown by displaying a ribbon trace of the peptide backbone (green) with the P1, P2′ and P5′ residues shown in stick model (carbon atoms: green). The positions of the P1, P2′ and P5′ residues in bovine pancreatic trypsin inhibitor (aprotinin, PDB ID 3FP6), Kunitz-type soybean trypsin inhibitor (PDB ID 1AVW), eglin C (PDB ID 1ACB), turkey ovomucoid third Kazal domain (PDB ID 1CHO), Momordica cochinchinensis trypsin inhibitor-II (PDB ID 4GUX) and hirustasin (PDB ID 1HIA) are illustrated in stick model (carbon atoms: grey).

The P5′ amino acid preferences identified in this study are also relevant to BBI evolution and to understanding the rationale for sequence diversity at P5′ in naturally occurring BBIs. In SFTI-1, the P5′ residue is located after two Pro residues (P3′‒P4′) and is followed by a Cys residue that is involved in an intramolecular disulfide bond (Figure 1). These two features are highly conserved in BBIs, as seen in an alignment of inhibitory loop sequences reported in a recent study on BBI evolution.20 Indeed, P3′ Pro and P6′ Cys are present in almost all BBIs (Figure 7), whereas P4′ Pro is occasionally replaced by a small amino acid (Ala or Gly). By contrast, the P5′ residue displays considerable sequence diversity, with Gln, Ile, Lys and Leu all well-represented in naturally occurring BBIs. Interestingly, the above-mentioned study highlighted the P5′ residue as the only residue in the Bowman‒Birk inhibitory loop that has yet to be investigated using synthetic peptides to determine its contribution to protease inhibition.20 Correlating the BBI sequence alignment with activity data from our SFTI-library revealed that the most prevalent amino acids in naturally occurring BBIs were also among the most broadly tolerated residues in the inhibitor library screen. In the BBI alignment, the most prevalent P5′ residue was Gln (approximately 40% of BBI sequences) and we found that Gln was among the most broadly tolerated P5′ residues in the SFTI-library screen. Indeed, Gln was the optimal P5′ residue for five different proteases (bovine β-trypsin, human cationic trypsin, bovine chymotrypsin, cathepsin G and KLK4). The P5′ residue in SFTI-1 (Ile) was the second most abundant P5′ residue in the BBI alignment. In the SFTI-library screen, Ile was also highly favored (optimal residue for KLK7 and thrombin), with Nle and Leu typically showing similar activity to Ile. Interestingly, Lys, Asp and Arg were also present at P5′ in the BBI alignment, which may reflect the contrasting specificities we observed for certain digestive enzymes, including bovine β-trypsin (basic residues preferred) and human cationic trypsin (acidic residues preferred). Collectively, these findings suggest that P5′ sequence diversity in BBIs may be related to the goal of achieving multi-target inhibition.

Figure 7. Sequence diversity in the Bowman‒Birk inhibitory loop. Inhibitory loop sequences were compiled from a recently reported BBI sequence alignment20 with diversity at each position (indicated across the x-axis) illustrated using a sequence logo graphic (relative frequency indicated on the y-axis).

Although certain amino acids were broadly tolerated at P5′ by most proteases examined in this study, we also identified several enzymes that displayed unique specificity profiles. This observation indicated that the P5′ residue can be targeted for optimization in engineered SFTI-variants. To test this hypothesis, we focused on KLK5 since the P5′ library screen revealed that its most preferred residues were Asp and Glu, which were not favored by closely related KLKs. We subsequently substituted the P5′ residue in two SFTI-based KLK5 inhibitors that have different sequences across the contact β-strand (P4‒P1)13,34 and found that the new variants displayed either improved activity or selectivity for KLK5. Additionally, NMR spectroscopy experiments revealed that substitutions at P5′ were structurally well-tolerated and characterizing the solution structure of the most selective KLK5 inhibitor (compound 5) revealed that it largely retained the structure of SFTI-1, despite having substitutions at seven of the scaffold’s fourteen residues. Another enzyme that displayed a unique specificity profile was matriptase, which preferred Arg and Lys. These findings were consistent with previous studies that produced potent matriptase inhibitors by examining substitutions at different positions in the SFTI-scaffold, including selected substitutions at P5′.6,8 In these studies, Arg was also found to be the most preferred P5′ residue, followed by Lys, whereas Gly and Asp were strongly disfavored. In the examples of KLK5 and matriptase, P5′ Ile was substituted with either acidic or basic residues to optimize the activity of separate SFTI-variants. More broadly, 1H 1D NMR spectra for members of the SFTI-library indicate that an even larger range of amino acids can be accommodated at P5′ in the SFTI-scaffold, if required in future engineering studies. For most SFTI-variants, peaks in the amide spectral region (7‒9 ppm) were sharp and well-dispersed, indicating that most variants had well-ordered structures in solution. The major exception occurred when Gly occupied P5′, although this finding was consistent with a previous study that showed a similar effect when Gly was substituted in place of P5′ Ile in SFTI-1.29 Although most residues appear to be suitable for substitution at P5′, we note that not all proteases display unique preferences that will assist with achieving higher selectivity.

In conclusion, our study demonstrates the adaptability of the P5′ residue in the Bowman‒Birk inhibitory loop using inhibitor variants based on SFTI-1, a cyclic peptide scaffold that attracts considerable interest for inhibitor engineering. We identified a subset of seven residues that were tolerated by most enzymes included in the library screen. Interestingly, these residues also correspond to the most prevalent amino acids found at the P5′ position in naturally occurring BBIs. Additionally, we identified several enzymes that display unique preferences at P5′, indicating that the P5′ residue has the potential to be optimized in SFTI-based engineering studies. We subsequently produced several new SFTI-variants for KLK5, a therapeutic target in skin pathologies and various cancers, by substituting the P5′ residue in existing engineered inhibitors and demonstrated that the new variants showed improved activity or selectivity.

Experimental Section

Reagents and proteins

Fmoc N-protected amino acids and coupling reagents for peptide synthesis were obtained from CSBio and Auspep, respectively. Analytical solvents for reverse-phase HPLC were from Merck unless stated otherwise and deuterated solvents (D2O, acetonitrile-d3) for NMR spectroscopy were from Cambridge Isotope Laboratories. Proteases were sourced from Sigma-Aldrich (bovine chymotrypsin, bovine β-trypsin and human cationic trypsin), R&D Systems (recombinant human matriptase) and Molecular Innovations (human cathepsin G, human α-thrombin and human β-factor XIIa). Recombinant human KLKs were expressed in zymogen form in Pichia pastoris strain X-33 or Spodoptera frugiperda Sf9 cells, then purified and activated as described previously.13 Synthetic peptide substrates not synthesized in-house were obtained from Peptide Institute (Boc-QAR-MCA, Boc-VPR-MCA and Suc-AAPF-MCA).

Peptide synthesis and validation

SFTI-based inhibitors were synthesized on a Symphony automated peptide synthesizer (Protein Technologies, Inc.) using Fmoc solid-phase chemistry on 2-chlorotrityl chloride resin (0.9 mmol/g, Chempep Inc.), as reported for previous SFTI-libraries.16 Peptide para-nitroanilide (pNA) substrates were synthesized manually on 2-chlorotrityl resin as previously described.50 All peptides were purified by HPLC (Shimadzu Prominence) using a reverse-phase C18 column. SFTI-variants were purified twice (once after sidechain deprotection and again after formation of the disulfide bond by air oxidation) using a preparative column (Zorbax 300SB-C18 PrepHT, 21.2 × 250 mm, 7 µm) and a linear gradient of 10‒50% acetonitrile/0.05% TFA. Peptide pNA-substrates were purified after sidechain deprotection. Peptides were validated by LC-MS using a Shimadzu Prominence system equipped with an Agilent Zorbax 300SB-C18 (5 µm) column (150 × 2.1 mm). High resolution mass spectrometry was subsequently performed using a SCIEX 5800 TOF/TOF instrument (AB Sciex). Products with greater than 95% purity were lyophilized and stored at ‒20 °C.

NMR spectroscopy

Each peptide in the P5′ SFTI-library (Table 1) was dissolved in 550 µL of H2O/D2O (10:1) with 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) included as an internal reference (0 ppm). All spectra were acquired at 298 K using a Bruker Avance III 600 MHz NMR spectrometer equipped with a cryoprobe. NMR experiments (1H 1D, TOCSY and NOESY) for engineered KLK5 inhibitors (1‒5) were carried out in acetonitrile-d3/H2O (3:7) as some of these peptides were less water soluble than peptides in the SFTI-library. Spectra used in 3D structure calculations were acquired in H2O/D2O (9:1). Mixing times were 80 ms for TOCSY experiments and 200 ms for NOESY experiments. A 3D NMR structure of 5 was generated by torsional angle dynamics using distance restraints derived from NOE intensities in the program CYANA.44 φ- and ψ-dihedral angle restraints were predicted from the HN, Hα, Cα, Cβ chemical shifts using TALOS.51 χ1-angles were incorporated based on NOE intensities of HN-Hβ1/2, Hα-Hβ1/2 and 3JHα-Hβ1/2 coupling constants derived from ECOSY experiments. No prominent solvent protected amides were observed in D2O exchange experiments, therefore no H-bond restraints were included in structure calculations. The final ensemble of structures was generated using CNS protocols,45 from which 20 structures were selected based on violations less than 0.3 Å or 2° and the best MolProbity scores.46 Experimental restraints and structural statistics are summarized in Table S1 and chemical shifts are reported in Table S2. The final 20 structures have been deposited in the PDB (ID 6NOX) and BMRB (ID 30562).
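The final selection step for the NMR ensemble can be illustrated with a short sketch. It assumes that a model is accepted only when both its largest distance violation and its largest dihedral violation fall below the stated limits, and that a lower MolProbity score is better; the `Model` class, the `select_ensemble` function and all numerical values are hypothetical and are not taken from the structure calculations reported here.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    max_distance_violation: float  # largest NOE distance violation, in Å
    max_dihedral_violation: float  # largest dihedral violation, in degrees
    molprobity_score: float        # lower is better


def select_ensemble(models, size=20, max_dist=0.3, max_dihedral=2.0):
    """Keep models within both violation limits, then take the best-scoring ones."""
    accepted = [
        m for m in models
        if m.max_distance_violation < max_dist
        and m.max_dihedral_violation < max_dihedral
    ]
    return sorted(accepted, key=lambda m: m.molprobity_score)[:size]


# Hypothetical values for four candidate models (illustration only).
candidates = [
    Model("m1", 0.12, 1.1, 1.8),
    Model("m2", 0.35, 0.9, 1.2),  # rejected: distance violation too large
    Model("m3", 0.20, 1.5, 1.4),
    Model("m4", 0.10, 2.4, 1.0),  # rejected: dihedral violation too large
]
print([m.name for m in select_ensemble(candidates, size=2)])
```

Filtering before ranking means that a model with an excellent validation score is still excluded if it violates the experimental restraints.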

Protease inhibition assays

SFTI-variants were assessed using competitive inhibition assays with either a colorimetric (peptide para-nitroanilide, pNA) or fluorescent (peptide 4-methylcoumaryl-7-amide, MCA) substrate. Assays to screen the SFTI-library were performed in low-binding 96-well plates (Corning) using a fixed concentration of protease, substrate and inhibitor. Specific conditions for each protease are listed in Table S3. Enzymatic activity was determined by monitoring cleavage of the pNA or MCA moiety using a Tecan M1000 Pro plate reader (pNA: absorbance, λ = 405 nm, reading interval = 15 s, assay time course = 10 min; MCA: fluorescence, λex = 360 nm, λem = 460 nm, reading interval = 60 s, assay time course = 10 min). The percentage inhibition for each SFTI-variant was calculated using the kinetic rate with inhibitor compared to the kinetic rate in control assays with enzyme and substrate only. Assays were performed three times in duplicate. Assays to determine the inhibition constant (Ki) were performed in a similar way except that a serial dilution of inhibitor (eight concentrations) was tested. Ki values were subsequently determined by non-linear regression (Morrison equation for tight binding inhibitors) in GraphPad Prism 7 using data from three experiments performed in triplicate.
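As an illustration of the calculations described above, the following sketch estimates a kinetic rate from a time course, converts it to percentage inhibition, and evaluates the standard form of the Morrison equation. It is a minimal sketch under stated assumptions: percentage inhibition is taken to be 100 × (1 − rate with inhibitor / control rate), the function names are hypothetical, and the absorbance readings are invented. The fitting itself was performed in GraphPad Prism and is not reproduced here.

```python
import math


def initial_rate(times_s, signal):
    """Least-squares slope of signal versus time (signal units per second)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, signal))
    return sxy / sxx


def percent_inhibition(rate_with_inhibitor, rate_control):
    """Percentage inhibition relative to the enzyme + substrate control."""
    return 100.0 * (1.0 - rate_with_inhibitor / rate_control)


def morrison_fractional_activity(inhibitor, enzyme, ki_app):
    """Fractional activity (v_i / v_0) predicted by the Morrison equation.

    All arguments must use the same concentration unit. ki_app is the
    apparent Ki; for a competitive inhibitor it is commonly related to
    the true Ki by ki_app = Ki * (1 + [S] / Km).
    """
    total = enzyme + inhibitor + ki_app
    bound = (total - math.sqrt(total ** 2 - 4.0 * enzyme * inhibitor)) / 2.0
    return 1.0 - bound / enzyme


# Hypothetical absorbance readings (405 nm) taken every 15 s.
times = [0, 15, 30, 45, 60]
control = [0.100, 0.130, 0.160, 0.190, 0.220]
inhibited = [0.100, 0.106, 0.112, 0.118, 0.124]

v0 = initial_rate(times, control)
vi = initial_rate(times, inhibited)
print(f"inhibition: {percent_inhibition(vi, v0):.1f}%")
```

The Morrison form is used for tight binding inhibitors because it accounts for depletion of free inhibitor by the enzyme, which the simpler hyperbolic model ignores.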

Molecular dynamics simulations

Existing structures of proteases in complex with SFTI-1 were used to generate models of selected SFTI-variants in complex with bovine β-trypsin (PDB ID 1SFI) or matriptase (PDB ID 3P8F). For remaining proteases, a model of the protease/SFTI-1 complex was first generated by overlaying the structure of each protease with the trypsin/SFTI-1 complex (PDB ID 1SFI): human cationic trypsin (PDB ID 2RA3, with Ala195 mutated back to the catalytic serine), KLK5 (PDB ID 2PSX), FXIIa (PDB ID 6B74) and thrombin (PDB ID 3VXE). Complexes were solvated with TIP3P water and the system was neutralized with Na+ and Cl- counter ions using VMD 1.9.2 (URL: http://www.ks.uiuc.edu/Research/vmd/).52 Each complex was subjected to stepwise equilibration, as described in recent studies,16,30 before production simulations (3 × 10 ns) were performed using NAMD 2.11 (URL: http://www.ks.uiuc.edu/Research/namd/)53 with CHARMM22 force field parameters.54 Detailed methods for equilibration steps and production simulations have been described previously.16 Graphics showing the modelled complex for each protease/SFTI-variant were generated by cluster analysis using Chimera 1.11.2 (URL: http://www.rbvi.ucsf.edu/chimera)55 and selecting a representative frame from the largest cluster (selected by the software). Figures were subsequently produced using CCP4MG (URL: http://www.ccp4.ac.uk/MG/).56
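The idea of taking a representative frame from the largest cluster can be sketched as follows. This is not the procedure implemented in Chimera, which chose the representative itself; here the representative is assumed to be the medoid (the frame with the smallest summed RMSD to the other members of its cluster), and the function name, cluster labels and RMSD values are hypothetical.

```python
from collections import Counter


def representative_frame(cluster_labels, rmsd):
    """Return (cluster_id, frame_index) for the medoid of the largest cluster.

    cluster_labels[i] is the cluster assigned to frame i.
    rmsd[i][j] is the pairwise RMSD between frames i and j.
    """
    largest, _ = Counter(cluster_labels).most_common(1)[0]
    members = [i for i, c in enumerate(cluster_labels) if c == largest]
    # Medoid: the member with the smallest summed RMSD to the other members.
    medoid = min(members, key=lambda i: sum(rmsd[i][j] for j in members))
    return largest, medoid


# Hypothetical example: five frames assigned to two clusters.
labels = [0, 0, 1, 0, 1]
rmsd = [
    [0.0, 0.5, 2.0, 0.6, 2.1],
    [0.5, 0.0, 2.2, 0.4, 2.3],
    [2.0, 2.2, 0.0, 2.1, 0.3],
    [0.6, 0.4, 2.1, 0.0, 2.2],
    [2.1, 2.3, 0.3, 2.2, 0.0],
]
print(representative_frame(labels, rmsd))
```

In practice the labels and RMSD matrix would come from the clustering software, with frames pooled from the three independent trajectories.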

Analysis of sequence diversity within the Bowman‒Birk inhibitory loop

Sequence diversity in the Bowman‒Birk inhibitory loop was examined using a BBI sequence alignment reported in a recent study (available online in ref. 20 as Supplemental File 1). For each BBI, the sequence of each inhibitory loop segment was extracted and the compiled sequences were submitted to WebLogo 3.6 (URL: http://weblogo.threeplusone.com) to generate a sequence logo illustrating diversity at each position.
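The per-position relative frequencies that underlie such a sequence logo can be computed with a few lines of code. The sketch below is an illustration only: it assumes that each loop has already been extracted as a nine-residue segment spanning P3 to P6′ (Cys to Cys), the function name is hypothetical, and apart from the SFTI-1 loop (CTKSIPPIC) the input sequences are invented rather than taken from the alignment in ref. 20.

```python
from collections import Counter

# Schechter-Berger labels for the nine-residue loop (Cys to Cys).
POSITIONS = ["P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'", "P5'", "P6'"]


def position_frequencies(loops):
    """Relative frequency of each residue at each loop position."""
    if any(len(seq) != len(POSITIONS) for seq in loops):
        raise ValueError("all loop sequences must have nine residues")
    profile = {}
    for index, label in enumerate(POSITIONS):
        counts = Counter(seq[index] for seq in loops)
        profile[label] = {aa: n / len(loops) for aa, n in counts.most_common()}
    return profile


# Illustrative input only: the first entry is the SFTI-1 loop; the others
# are made-up variants, not sequences from the published alignment.
loops = ["CTKSIPPIC", "CTKSNPPQC", "CTRSIPPQC", "CTLSIPPQC", "CTKSIPAKC"]
profile = position_frequencies(loops)
print(profile["P5'"])
```

Positions with a single dominant residue, such as P3′ and P6′ in this toy input, appear as one tall letter in a logo, whereas a variable position such as P5′ is split between several letters.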

Ancillary Information

Supporting information

The supporting information is available free of charge on the ACS Publications website at DOI:

LC-MS traces, 1H 1D NMR spectra, molecular modeling, structural statistics and chemical shifts for compound 5 and enzyme-specific assay conditions (PDF)

Accession codes

Structural coordinates have been deposited in the PDB (5, PDB ID 6NOX). Authors will release the atomic coordinates and experimental data upon article publication.

Author information

Corresponding authors

Email: [email protected] (D.J.C.), Email: [email protected] (S.J.D.)

Author contributions

C.Y.L. and S.J.D. contributed equally to this work. All authors provided input on drafting the manuscript.

Acknowledgements

This work was funded by a grant from the Australian Research Council (ARC, DP150100443). S.J.D. and J.E.S. were funded by National Health and Medical Research Council (NHMRC) Early Career Fellowships (GNT1120066 and GNT1069819) and D.J.C. is an ARC Australian Laureate Fellow (FL150100146). We thank Dr Peta Harvey for assistance with NMR experiments and Olivier Cheneval for synthesis support.

Abbreviations used

BBI, Bowman‒Birk inhibitor; Bip, 4,4′-biphenyl-L-alanine; Cit, citrulline; KLK, kallikrein-related peptidase; MCA, 4-methylcoumaryl-7-amide; Nle, norleucine; pNA, para-nitroanilide; SFTI-1, sunflower trypsin inhibitor-1


References

1. Bachovchin, D. A.; Cravatt, B. F. The pharmacological landscape and therapeutic potential of serine

hydrolases. Nat. Rev. Drug Discov. 2012, 11, 52-68.

2. Drag, M.; Salvesen, G. S. Emerging principles in protease-based drug discovery. Nat. Rev. Drug

Discov. 2010, 9, 690-701.

3. Mullard, A. 2013 FDA drug approvals. Nat. Rev. Drug Discov. 2014, 13, 85-89.

4. Chaudhuri, D.; Aboye, T.; Camarero, J. A. Using backbone-cyclized Cys-rich polypeptides as

molecular scaffolds to target protein-protein interactions. Biochem. J. 2019, 476, 67-83.

5. Luckett, S.; Garcia, R. S.; Barker, J. J.; Konarev, A. V.; Shewry, P. R.; Clarke, A. R.; Brady, R. L.

High-resolution structure of a potent, cyclic proteinase inhibitor from sunflower seeds. J. Mol. Biol.

1999, 290, 525-533.

6. Quimbar, P.; Malik, U.; Sommerhoff, C. P.; Kaas, Q.; Chan, L. Y.; Huang, Y. H.; Grundhuber, M.;

Dunse, K.; Craik, D. J.; Anderson, M. A.; Daly, N. L. High-affinity cyclic peptide matriptase

inhibitors. J. Biol. Chem. 2013, 288, 13885-13896.

7. Swedberg, J. E.; Nigon, L. V.; Reid, J. C.; de Veer, S. J.; Walpole, C. M.; Stephens, C. R.; Walsh, T.

P.; Takayama, T. K.; Hooper, J. D.; Clements, J. A.; Buckle, A. M.; Harris, J. M. Substrate-guided

design of a potent and selective kallikrein-related peptidase inhibitor for kallikrein 4. Chem. Biol.

2009, 16, 633-643.

8. Fittler, H.; Avrutina, O.; Glotzbach, B.; Empting, M.; Kolmar, H. Combinatorial tuning of peptidic

drug candidates: high-affinity matriptase inhibitors through incremental structure-guided

optimization. Org. Biomol. Chem. 2013, 11, 1848-1857.

9. Korsinczky, M. L.; Schirra, H. J.; Rosengren, K. J.; West, J.; Condie, B. A.; Otvos, L.; Anderson, M.

A.; Craik, D. J. Solution structures by 1H NMR of the novel cyclic trypsin inhibitor SFTI-1 from

sunflower seeds and an acyclic permutant. J. Mol. Biol. 2001, 311, 579-591.

10. Zablotna, E.; Kazmierczak, K.; Jaskiewicz, A.; Stawikowski, M.; Kupryszewski, G.; Rolka, K.

Chemical synthesis and kinetic study of the smallest naturally occurring trypsin inhibitor SFTI-1

isolated from sunflower seeds and its analogues. Biochem. Biophys. Res. Commun. 2002, 292, 855-

859.

11. Marx, U. C.; Korsinczky, M. L.; Schirra, H. J.; Jones, A.; Condie, B.; Otvos, L., Jr.; Craik, D. J.

Enzymatic cyclization of a potent Bowman-Birk protease inhibitor, sunflower trypsin inhibitor-1,

and solution structure of an acyclic precursor peptide. J. Biol. Chem. 2003, 278, 21782-21789.

12. Korsinczky, M. L.; Clark, R. J.; Craik, D. J. Disulfide bond mutagenesis and the structure and

function of the head-to-tail macrocyclic trypsin inhibitor SFTI-1. Biochemistry 2005, 44, 1145-1153.

13. de Veer, S. J.; Swedberg, J. E.; Akcan, M.; Rosengren, K. J.; Brattsand, M.; Craik, D. J.; Harris, J.

M. Engineered protease inhibitors based on sunflower trypsin inhibitor-1 (SFTI-1) provide insights

into the role of sequence and conformation in Laskowski mechanism inhibition. Biochem. J. 2015,

469, 243-253.

14. Daly, N. L.; Chen, Y. K.; Foley, F. M.; Bansal, P. S.; Bharathi, R.; Clark, R. J.; Sommerhoff, C. P.;

Craik, D. J. The absolute structural requirement for a proline in the P3'-position of Bowman-Birk

protease inhibitors is surmounted in the minimized SFTI-1 scaffold. J. Biol. Chem. 2006, 281, 23668-

23675.

15. Austin, J.; Kimura, R. H.; Woo, Y. H.; Camarero, J. A. In vivo biosynthesis of an Ala-scan library

based on the cyclic peptide SFTI-1. Amino Acids 2010, 38, 1313-1322.

16. de Veer, S. J.; Wang, C. K.; Harris, J. M.; Craik, D. J.; Swedberg, J. E. Improving the selectivity of

engineered protease inhibitors: optimizing the P2 prime residue using a versatile cyclic peptide

library. J. Med. Chem. 2015, 58, 8257-8268.

17. Birk, Y. Purification and some properties of a highly active inhibitor of trypsin and alpha-

chymotrypsin from soybeans. Biochim. Biophys. Acta 1961, 54, 378-381.

18. Bowman, D. E. Differentiation of soy bean antitryptic factors. Proc. Soc. Exp. Biol. Med. 1946, 63,

547-550.


19. Elliott, A. G.; Delay, C.; Liu, H. L.; Phua, Z. Y.; Rosengren, K. J.; Benfield, A. H.; Panero, J. L.;

Colgrave, M. L.; Jayasena, A. S.; Dunse, K. M.; Anderson, M. A.; Schilling, E. E.; Ortiz-Barrientos,

D.; Craik, D. J.; Mylne, J. S. Evolutionary origins of a bioactive peptide buried within preproalbumin.

Plant Cell 2014, 26, 981-995.

20. James, A. M.; Jayasena, A. S.; Zhang, J.; Berkowitz, O.; Secco, D.; Knott, G. J.; Whelan, J.; Bond,

C. S.; Mylne, J. S. Evidence for ancient origins of Bowman-Birk inhibitors from Selaginella

moellendorffii. Plant Cell 2017, 29, 461-473.

21. Schechter, I.; Berger, A. On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res.

Commun. 1967, 27, 157-162.

22. Domingo, G. J.; Leatherbarrow, R. J.; Freeman, N.; Patel, S.; Weir, M. Synthesis of a mixture of

cyclic peptides based on the Bowman-Birk reactive site loop to screen for serine protease inhibitors.

Int. J. Pept. Protein Res. 1995, 46, 79-87.

23. Gariani, T.; McBride, J. D.; Leatherbarrow, R. J. The role of the P2' position of Bowman-Birk

proteinase inhibitor in the inhibition of trypsin. Studies on P2' variation in cyclic peptides

encompassing the reactive site loop. Biochim. Biophys. Acta 1999, 1431, 232-237.

24. McBride, J. D.; Brauer, A. B.; Nievo, M.; Leatherbarrow, R. J. The role of threonine in the P2

position of Bowman-Birk proteinase inhibitors: studies on P2 variation in cyclic peptides

encompassing the reactive site loop. J. Mol. Biol. 1998, 282, 447-458.

25. Brauer, A. B.; Leatherbarrow, R. J. The conserved P1' Ser of Bowman-Birk-type proteinase

inhibitors is not essential for the integrity of the reactive site loop. Biochem. Biophys. Res. Commun.

2003, 308, 300-305.

26. Brauer, A. B. E.; Domingo, G. J.; Cooke, R. M.; Matthews, S. J.; Leatherbarrow, R. J. A conserved

cis peptide bond is necessary for the activity of Bowman-Birk inhibitor protein. Biochemistry 2002,

41, 10608-10615.

27. Song, G.; Zhou, M.; Chen, W.; Chen, T.; Walker, B.; Shaw, C. HV-BBI‒a novel amphibian skin

Bowman-Birk-like trypsin inhibitor. Biochem. Biophys. Res. Commun. 2008, 372, 191-196.


28. Chen, X.; Riley, B. T.; de Veer, S. J.; Hoke, D. E.; Van Haeften, J.; Leahy, D.; Swedberg, J. E.;

Brattsand, M.; Hartfield, P. J.; Buckle, A. M.; Harris, J. M. Potent, multi-target serine protease

inhibition achieved by a simplified beta-sheet motif. PLoS One 2019, 14, e0210842.

29. Shariff, L.; Zhu, Y.; Cowper, B.; Di, W.; Macmillan, D. Sunflower trypsin inhibitor (SFTI-1)

analogues of synthetic and biological origin via N→S acyl transfer: potential inhibitors of human

Kallikrein-5 (KLK5). Tetrahedron 2014, 70, 7675-7680.

30. de Veer, S. J.; Li, C. Y.; Swedberg, J. E.; Schroeder, C. I.; Craik, D. J. Engineering potent

mesotrypsin inhibitors based on the plant-derived cyclic peptide, sunflower trypsin inhibitor-1. Eur.

J. Med. Chem. 2018, 155, 695-704.

31. Capaldi, S.; Perduca, M.; Faggion, B.; Carrizo, M. E.; Tava, A.; Ragona, L.; Monaco, H. L. Crystal

structure of the anticarcinogenic Bowman-Birk inhibitor from snail medic (Medicago scutellata)

seeds complexed with bovine trypsin. J. Struct. Biol. 2007, 158, 71-79.

32. Yuan, C.; Chen, L.; Meehan, E. J.; Daly, N.; Craik, D. J.; Huang, M.; Ngo, J. C. Structure of catalytic

domain of matriptase in complex with sunflower trypsin inhibitor-1. BMC Struct. Biol. 2011, 11, 30.

33. Bode, W.; Mayr, I.; Baumann, U.; Huber, R.; Stone, S. R.; Hofsteenge, J. The refined 1.9 Å crystal-

structure of human alpha-thrombin - interaction with D-Phe-Pro-Arg chloromethylketone and

significance of the Tyr-Pro-Pro-Trp insertion segment. EMBO J. 1989, 8, 3467-3475.

34. de Veer, S. J.; Swedberg, J. E.; Brattsand, M.; Clements, J. A.; Harris, J. M. Exploring the active site

binding specificity of kallikrein-related peptidase 5 (KLK5) guides the design of new peptide

substrates and inhibitors. Biol. Chem. 2016, 397, 1237-1249.

35. Zhu, Y.; Underwood, J.; Macmillan, D.; Shariff, L.; O'Shaughnessy, R.; Harper, J. I.; Pickard, C.;

Friedmann, P. S.; Healy, E.; Di, W. L. Persistent kallikrein 5 activation induces atopic dermatitis-

like skin architecture independent of PAR2 activity. J. Allergy Clin. Immunol. 2017, 140, 1310-1322

e1315.


36. Furio, L.; de Veer, S.; Jaillet, M.; Briot, A.; Robin, A.; Deraison, C.; Hovnanian, A. Transgenic

kallikrein 5 mice reproduce major cutaneous and systemic hallmarks of Netherton syndrome. J. Exp.

Med. 2014, 211, 499-513.

37. Furio, L.; Pampalakis, G.; Michael, I. P.; Nagy, A.; Sotiropoulou, G.; Hovnanian, A. KLK5

inactivation reverses cutaneous hallmarks of Netherton syndrome. PLoS Genet. 2015, 11, e1005389.

38. Dorn, J.; Magdolen, V.; Gkazepis, A.; Gerte, T.; Harlozinska, A.; Sedlaczek, P.; Diamandis, E. P.;

Schuster, T.; Harbeck, N.; Kiechle, M.; Schmitt, M. Circulating biomarker tissue kallikrein-related

peptidase KLK5 impacts ovarian cancer patients' survival. Ann. Oncol. 2011, 22, 1783-1790.

39. Jiang, R.; Shi, Z.; Johnson, J. J.; Liu, Y.; Stack, M. S. Kallikrein-5 promotes cleavage of desmoglein-

1 and loss of cell-cell cohesion in oral squamous cell carcinoma. J. Biol. Chem. 2011, 286, 9127-

9135.

40. Magnen, M.; Gueugnon, F.; Guillon, A.; Baranek, T.; Thibault, V. C.; Petit-Courty, A.; de Veer, S.

J.; Harris, J.; Humbles, A. A.; Si-Tahar, M.; Courty, Y. Kallikrein-related peptidase 5 contributes to

H3N2 influenza virus infection in human lungs. J. Virol. 2017, 91, e00421-00417.

41. Brännström, K.; Öhman, A.; von Pawel Rammingen, U.; Olofsson, A.; Brattsand, M.

Characterization of SPINK9, a KLK5-specific inhibitor expressed in palmo-plantar epidermis. Biol.

Chem. 2012, 393, 369-377.

42. Brattsand, M.; Stefansson, K.; Hubiche, T.; Nilsson, S. K.; Egelrud, T. SPINK9: a selective, skin-

specific Kazal-type serine protease inhibitor. J. Invest. Dermatol. 2009, 129, 1656-1665.

43. Wishart, D. S.; Bigam, C. G.; Holm, A.; Hodges, R. S.; Sykes, B. D. 1H, 13C and 15N random coil

NMR chemical shifts of the common amino acids. I. Investigations of nearest-neighbor effects. J.

Biomol. NMR 1995, 5, 67-81.

44. Güntert, P.; Buchner, L. Combined automated NOE assignment and structure calculation with

CYANA. J. Biomol. NMR 2015, 62, 453-471.

45. Nederveen, A. J.; Doreleijers, J. F.; Vranken, W.; Miller, Z.; Spronk, C. A.; Nabuurs, S. B.; Güntert,

P.; Livny, M.; Markley, J. L.; Nilges, M.; Ulrich, E. L.; Kaptein, R.; Bonvin, A. M. RECOORD: a

recalculated coordinate database of 500+ proteins from the PDB using restraints from the

BioMagResBank. Proteins 2005, 59, 662-672.

46. Chen, V. B.; Arendall, W. B.; Headd, J. J.; Keedy, D. A.; Immormino, R. M.; Kapral, G. J.; Murray,

L. W.; Richardson, J. S.; Richardson, D. C. MolProbity: all-atom structure validation for

macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 2010, 66, 12-21.

47. Laskowski, M., Jr.; Kato, I. Protein inhibitors of proteinases. Annu. Rev. Biochem. 1980, 49, 593-

626.

48. Laskowski, M.; Qasim, M. A. What can the structures of enzyme-inhibitor complexes tell us about

the structures of enzyme substrate complexes? Biochim. Biophys. Acta 2000, 1477, 324-337.

49. Bode, W.; Huber, R. Ligand binding: proteinase‒protein inhibitor interactions. Curr. Opin. Struct.

Biol. 1991, 1, 45-52.

50. Swedberg, J. E.; Li, C. Y.; de Veer, S. J.; Wang, C. K.; Craik, D. J. Design of potent and selective

cathepsin G inhibitors based on the sunflower trypsin inhibitor-1 scaffold. J. Med. Chem. 2017, 60,

658-667.

51. Shen, Y.; Delaglio, F.; Cornilescu, G.; Bax, A. TALOS+: a hybrid method for predicting protein

backbone torsion angles from NMR chemical shifts. J. Biomol. NMR 2009, 44, 213-223.

52. Humphrey, W.; Dalke, A.; Schulten, K. VMD: visual molecular dynamics. J. Mol. Graph. 1996, 14,

33-38.

53. Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.;

Kale, L.; Schulten, K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781-

1802.

54. MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer,

S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T.; Mattos, C.;

Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith,

J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. All-atom

empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B 1998,

102, 3586-3616.

55. Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin,

T. E. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem.

2004, 25, 1605-1612.

56. McNicholas, S.; Potterton, E.; Wilson, K. S.; Noble, M. E. Presenting your structures: the CCP4mg

molecular-graphics software. Acta Crystallogr. D Biol. Crystallogr. 2011, 67, 386-394.

57. de Veer, S. J.; Furio, L.; Swedberg, J. E.; Munro, C. A.; Brattsand, M.; Clements, J. A.; Hovnanian,

A.; Harris, J. M. Selective substrates and inhibitors for kallikrein-related peptidase 7 (KLK7) shed

light on KLK proteolytic activity in the stratum corneum. J. Invest. Dermatol. 2017, 137, 430-439.

TOC graphic