
SYNTHESIS AND SCREENING OF SUPPORT-BOUND COMBINATORIAL CYCLIC PEPTIDE AND FREE C-TERMINAL PEPTIDE LIBRARIES

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Sang Hoon Joo, M. S.

*****

The Ohio State University 2007

Dissertation Committee:

Professor Dehua Pei, Advisor

Professor Ross E. Dalbey

Professor Thomas J. Magliery

Approved by

______
Advisor
Graduate Program in Chemistry

ABSTRACT

One-bead one-compound (OBOC) peptide libraries have been useful tools in the biomedical sciences. However, OBOC peptide libraries usually display the N-termini of peptides on the bead surface, because conventional solid-phase synthesis proceeds in the C to N direction. While large combinatorial libraries of cyclic peptides can be synthesized by the split-and-pool method, sequence determination of the library members has been a challenge. Peptide libraries with free C-termini face the same problem, as well as the difficulty of synthesis in the N to C direction. We report here the development of cyclic peptide libraries and C-terminal peptide libraries for high-throughput screening and sequencing.

TentaGel microbeads (90 μm) were spatially segregated into outer and inner layers; cyclic peptides were displayed on the bead surface, whereas the inner core of each bead contained the corresponding linear encoding peptide. After screening of the cyclic peptide library, the identity of hit peptides was determined by sequencing the linear encoding peptides using a partial Edman degradation/mass spectrometry method. Using the same spatial segregation approach, peptides were synthesized in the conventional C to N direction, with their C-termini attached to the support through an ester linkage on the bead surface but through an amide bond in the inner layer. The surface peptides were cyclized between the N-terminal amine and a carboxyl group installed at a C-terminal linker sequence, while the internal peptides stayed in the linear form. Base hydrolysis of the ester linkage in the cyclic peptides exposed a free α-carboxyl group at the C-termini of the peptides, which were now attached to the resin via their N-termini. An inverted peptide library containing five random residues was synthesized and screened for binding to PDZ domains. The identity of the binding peptides was determined from the encoding peptides. Consensus recognition motifs were identified for the PDZ domains, and representative peptides were individually synthesized and confirmed to bind to their cognate PDZ domains. These methods expand the utility of OBOC peptide libraries by displaying peptides in different ways.


Dedication

To My Father in Heaven


ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Dehua Pei, for his support and encouragement. His dedication to excellence in science inspired me to continue this journey. I also thank Dr. Ming-Daw Tsai, my former advisor, for letting me work on diverse projects, which allowed me to experience their challenges and opportunities. In addition, I would like to thank the professors in the Biological Division, especially Drs. Ross Dalbey and Thomas Magliery, for their kind support and inspiration.

I would like to thank my labmates, former and current: Drs. Junguk Park, Yun Ling, Jinge Grace Zhu, Qing Xiao, Anne-Sophie Wavreille, and Bhaskar Gopishetty; Mss. Jing Zhang, Yanyan Zhang, and Pauline Tan; and Messrs. Amit Thakkar, Mathieu Garaud, and particularly Tao Liu, for their stimulating discussions.

I am indebted to Dr. Junan Li for his mentoring, and I am grateful to Dr. Deborah Parris for her help with the baculovirus system, and to Drs. Michael Zhu and Charles Brooks for their collaborations and active discussions.

I cannot thank my parents enough for their love and support. Finally, my special thanks go to my wife, Sook Kyung, for her understanding and support.


VITA

1972. 4. 18. Born - Seoul, South Korea

1995. 2. B.S. Pharmacy, Seoul National University, Seoul, South Korea

2000. 8. M.S. Pharmacy, Seoul National University, Seoul, South Korea

2001-2007 Graduate Teaching and Research Associate, The Ohio State University

PUBLICATIONS

1. You-Chin Lin, Mitchell B. Diccianni, Youngjin Kim, Hsin-Hung Lin, Chien-Hsin Lee, Ruey-Jen Lin, Sang Hoon Joo, Junan Li, An-Suei Yang, Huan-Hsien Kuo, Ming-Daw Tsai, and Alice L. Yu. Human p16γ, a novel transcriptional variant of p16INK4A, co-expresses with p16INK4A in cancer cells and inhibits cell cycle progression. Oncogene (2007) in press.

2. Sang Hoon Joo, Qing Xiao, Yun Ling, Bhaskar Gopishetty, and Dehua Pei. High-Throughput Sequence Determination of Cyclic Peptide Library Members by Partial Edman Degradation/Mass Spectrometry. Journal of the American Chemical Society (2006) 128(39), 13000-13009.

3. Jeong-In Oh, Kwang-Hoon Chun, Sang-Hoon Joo, You-Take Oh, and Seung-Ki Lee. Caspase-3-dependent protein kinase C delta activity is required for the progression of Ginsenoside-Rh2-induced apoptosis in SK-HEP-1 cells. Cancer Letters (2005) 230(2), 228-238.

4. Junan Li, Peter Muscarella, Sang Hoon Joo, Thomas J. Knobloch, W. Scott Melvin, Christopher M. Weghorst, and Ming-Daw Tsai. Dissection of CDK4-Binding and Transactivation Activities of p34SEI-1 and Comparison between Functions of p34SEI-1 and p16INK4A. Biochemistry (2005) 44(40), 13246-13256.

5. Young-Mi Ham, Joon-Seok Choi, Kwang-Hoon Chun, Sang-Hoon Joo, and Seung-Ki Lee. The c-Jun N-terminal Kinase 1 Activity Is Differentially Regulated by Specific Mechanisms during Apoptosis. The Journal of Biological Chemistry (2003) 278(50), 50330-50337.

6. Junan Li, Sang Hoon Joo, and Ming-Daw Tsai. An NF-kappaB-specific inhibitor, IkappaBalpha, binds to and inhibits cyclin-dependent kinase 4. Biochemistry (2003) 42(46), 13476-13483.

FIELDS OF STUDY

Major Field: Chemistry


TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures
List of Abbreviations

Chapter 1 Introduction
1.1 Combinatorial Peptide Libraries
1.2 Cyclic Peptides
1.3 Biologically Active Cyclic Peptides
1.3.1 Tyrocidine and Gramicidin S
1.3.2 Cyclosporin A
1.3.3 RGD Peptide
1.3.4 RNA Binding Cyclic Peptides
1.4 Cyclic Peptides with Genetic Encoding
1.4.1 Phage Display for Cyclic Peptides
1.4.2 Intein-Mediated Cyclization
1.4.3 mRNA Display Based Cyclic Peptides
1.5 Synthetic Cyclic Peptide Libraries
1.5.1 Iterative Deconvolution
1.5.2 Tandem Mass Spectrometry
1.6 Peptide Libraries with Free C-Termini
1.6.1 PDZ Domains
1.6.2 14-3-3 Proteins
1.7 Biological Libraries for Studying the Specificities of PDZ Domains
1.7.1 Phage Display
1.7.2 lacI Repressor
1.7.3 Two-Hybrid System
1.7.4 FRET Based Screening
1.8 Synthetic Libraries for Studying the Specificities of PDZ Domains
1.8.1 Solution Phase Screening Using Synthetic Peptide Library
1.8.2 Inverted Peptides on Solid Support
1.8.3 Protein Microarray

Chapter 2 Synthesis and Screening of Support-Bound Cyclic Peptide Libraries
2.1 Introduction
2.2 Results
2.2.1 Design Strategy and Synthesis of Cyclic Peptide Libraries
2.2.2 Cyclization Efficiency and Cyclic/Linear Peptide Ratio
2.2.3 Sequence Determination of Cyclic Peptides by PED/MS
2.2.4 On-Bead Screening for Streptavidin Binding Ligands
2.2.5 On-Bead Screening for Porcine α-Amylase Inhibitors
2.3 Discussion
2.4 Conclusion
2.5 Experimental Sections
2.5.1 Materials
2.5.2 Synthesis of Nα-Fmoc-Glu(δ-N-hydroxysuccinimidyl)-O-CH2CH=CH2
2.5.3 Synthesis of Cyclic Peptide Libraries
2.5.4 Determination of Cyclization Efficiency and Molar Ratio of Cyclic/Linear Peptides
2.5.5 Peptide Sequencing by PED/MS
2.5.6 MALDI-TOF Analysis of the Peptide
2.5.7 Library Screening for Streptavidin Binding
2.5.8 Labeling of α-Amylase with Biotin and Texas Red
2.5.9 Library Screening for α-Amylase Binding
2.5.10 Synthesis of Individual Peptides Binding to Streptavidin or α-Amylase
2.5.11 SA-AP Pull-Down Assay
2.5.12 α-Amylase Inhibition Assay with Starch as Substrate

Chapter 3 Synthesis and Screening of Resin-Bound Combinatorial Libraries with Free C-Termini: Determination of the Sequence Specificities of PDZ Domains
3.1 Introduction
3.2 Results
3.2.1 Design Strategy and Synthesis of Free C-Terminal Peptide Library
3.2.2 Characterization of the Inverted Peptide Library
3.2.3 Peptide Sequence Determination by PED
3.2.4 Identification of Binding Ligands for PDZ Domains
3.2.5 Binding Affinity between Selected Peptides and CIPP PDZ Domains
3.2.6 Database Search of Potential CIPP-Binding Proteins
3.3 Discussion
3.4 Conclusion
3.5 Experimental Sections
3.5.1 Materials
3.5.2 Synthesis of Inverted Peptide Libraries
3.5.3 Peptide Sequencing by PED/MS
3.5.4 Recombinant DNA Constructs
3.5.5 Purification and Biotinylation of PDZ Domains
3.5.6 Peptide Library Screening
3.5.7 Synthesis of Individual PDZ-Binding Peptides
3.5.8 Determination of Dissociation Constants by SPR

Bibliography

LIST OF TABLES

2.1. Cyclization Efficiency and Cyclic/Linear Peptide Ratio for 50 Randomly Selected Beads
2.2. Comparison of the Ionization Efficiencies of Linear and Cyclic Peptides in MALDI MS
2.3. Success Rate for Sequencing Resin-Bound Cyclic Peptides by PED/MS
2.4. Cyclic Peptides Selected against Streptavidin
2.5. Selected Sequences from Library II against Biotin-Labeled α-Amylase
2.6. Selected Sequences against Fluorescence-Labeled α-Amylase
3.1. Internal/Surface Peptide Ratio for 24 Randomly Selected Beads
3.2. Success Rate for Sequencing Resin-Bound Inverted Peptides by PED/MS
3.3. Selected Sequences for PDZ1 Domain from NHERF1 (67 total)
3.4. Peptide Sequences Selected against the PDZ2 Domain of CIPP (82 total)
3.5. Peptide Sequences Selected against the PDZ3 Domain of CIPP (80 total)
3.6. Peptide Sequences Selected against the PDZ4 Domain of CIPP (134 total)
3.7. Dissociation Constants (KD, μM) of Selected Peptides toward CIPP PDZ Domains
3.8. Potential CIPP-PDZ2 Binding Proteins from Database Search
3.9. Potential CIPP-PDZ3 Binding Proteins from Database Search
3.10. Potential CIPP-PDZ4 Binding Proteins from Database Search

LIST OF FIGURES

1.1. Resin-Bound Peptide Display
1.2. Structure of Gramicidin S and Tyrocidine A
1.3. Comparison of the Parallel Synthesis and Split-and-Pool Synthesis
1.4. Cyclo-RGD Peptide
1.5. Intein Mediated Cyclization
1.6. mRNA Display Based Cyclic Peptides
1.7. Solid Phase Synthesis of Cyclic Peptides
1.8. Concept of Iterative Deconvolution
1.9. Collision Induced Dissociation Fragmentation Patterns for Linear Peptides and Cyclic Peptides
1.10. Side Reactions of N to C Peptide Synthesis
1.11. lac Repressor C-Terminal Peptide Library
1.12. Peptide Inversion through Cyclization
2.1. Synthesis of Cyclic Peptide Library
2.2. MALDI-TOF MS Spectra Showing the Peptide Cyclization on Bead
2.3. MALDI-TOF MS Spectra of 8 Cyclic Peptides Individually Synthesized on Bead
2.4. HPLC and MS Analysis of Linear and Cyclic Peptide AVWmeFRRVQ
2.5. HPLC and MS Analysis of Linear and Cyclic Peptide AVWfFRRVQ
2.6. Partial Edman Degradation of Cyclic Peptide Library
2.7. Binding of SA-AP to Immobilized Peptide Cyclo(GTHPQALE)
2.8. Inhibition of α-Amylase by Cyclic Peptide Cyclo(AWfFYFKVE)NH2
2.9. Modified Cyclic Peptide Libraries for α-Amylase Inhibitors
2.10. Design of Cyclic Peptide Library for Tyrocidine A Analogs
3.1. Synthesis of Spatially Segregated and Inverted Peptide Library
3.2. MALDI-TOF Mass Spectra of Library I Peptides at Different Stages of Synthesis
3.3. Partial Edman Degradation of Free C-Termini Peptide Library
3.4. Sequence Specificity of NHERF1 PDZ1 Domain
3.5. Sequence Specificity of CIPP PDZ Domains
3.6. SPR Analysis of the Binding Interaction between Immobilized PDZ Domains and Peptide Ac-YAAKHYYV-OH

LIST OF ABBREVIATIONS

α    alpha
Abu    α-Aminobutyric acid, or (S)-2-aminobutyric acid
Ac    acetyl
β    beta
BCIP    5-Bromo-4-chloro-3-indolyl Phosphate
BSA    Bovine Serum Albumin
t-Bu    tert-butyl
Bn    benzyl
Bz    benzoyl
°C    degrees Celsius
calcd    calculated
δ    delta
ddH2O    Double distilled water
DMAP    4-(N,N-dimethylamino)pyridine
DCM    Dichloromethane
DMF    N,N-dimethylformamide
DMSO    dimethylsulfoxide
eq. or equiv    equivalent
Et    ethyl
FACS    Fluorescence activated cell sorter
Fmoc    9-Fluorenylmethoxycarbonyl
Fmoc-OSU    N-(9-Fluorenylmethoxycarbonyloxy)succinimide
γ    gamma
g    gram(s)
GST    Glutathione S-Transferase
h    hour(s)
HATU    2-(1H-7-Azabenzotriazol-1-yl)-1,1,3,3-tetramethyl-uronium-hexafluoro-phosphate methanaminium
HBTU    O-Benzotriazole-N,N,N',N'-tetramethyl-uronium-hexafluoro-phosphate
HOAt    1-Hydroxy-7-azabenzotriazole
HOBt    N-hydroxybenzotriazole
HPLC    High Pressure Liquid Chromatography
i.d.    inner diameter
IPTG    Isopropyl-β-D-thiogalactoside
k    kilo
L    liter(s)
LB    Luria-Bertani
m    milli
μ    micro
M    moles per liter
MALDI-TOF    Matrix Assisted Laser Desorption Ionization-Time of Flight
Me    methyl
min    minute(s)
mol    mole(s)
MS    mass spectrometry
m/z    mass to charge ratio (MS)
NHS    N-hydroxysuccinimidyl
Nic-OSU    Nicotinic Acid O-Succinimide, other name Nic-NHS
Nle    Norleucine
NMM    N-Methyl Morpholine, or 4-Methylmorpholine
PCR    Polymerase Chain Reaction
PED    Partial Edman Degradation
PEG    Polyethylene glycol
Ph    phenyl
PITC    Phenylisothiocyanate
SA-AP    Streptavidin-Alkaline Phosphatase
SDS-PAGE    Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis
SPPS    Solid Phase Peptide Synthesis
rt    room temperature
t    tertiary (tert)
TFA    Trifluoroacetic Acid
THF    tetrahydrofuran

Standard one-letter codes are used for deoxynucleotides, and standard one- or three-letter codes are used for amino acids.


CHAPTER 1

INTRODUCTION

1.1 Combinatorial Peptide Libraries

The one-bead one-compound (OBOC) peptide library method was first introduced by Lam and co-workers in 1991 [1], based on split-and-pool synthesis [1-3]. Each bead prepared by this method displays multiple copies of the same peptide (~100 pmol of peptide for a 90 μm bead), and it takes a relatively short time (1-2 weeks) to prepare a random peptide library of tremendous diversity (e.g., 20^5 = 3.2 million sequences for 5 random positions). OBOC combinatorial libraries have been used successfully to identify ligands for various biological targets, such as streptavidin [1] and SH2 domains [4].
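The diversity figure quoted above follows from simple counting: each random position can hold any one of the building blocks, so the number of sequences is the number of building blocks raised to the number of random positions. A minimal Python sketch of this arithmetic (the function name `library_diversity` is hypothetical and used only for illustration):

```python
def library_diversity(n_building_blocks: int, n_random_positions: int) -> int:
    """Number of distinct linear sequences in a fully randomized library.

    Assumes every random position is filled independently from the same
    set of building blocks (an idealization of split-and-pool synthesis).
    """
    return n_building_blocks ** n_random_positions


# 20 proteinogenic amino acids at 5 random positions
print(library_diversity(20, 5))  # 3200000
```

The same function gives the larger numbers discussed later in this chapter, for example `library_diversity(20, 10)` for decapeptides.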

During the screening of the libraries against biological targets, the peptides stay bound to the solid support, and the ligand-target interactions can be visualized by various methods for the selection of “hits”. Once the hits are selected, their sequences must be identified. With a method recently developed in our laboratory, called partial Edman degradation (PED) [5-7], the peptide sequences of library members can be determined very efficiently and at relatively low cost. Briefly, beads carrying unique peptide sequences are subjected to multiple cycles of PED by treatment with a mixture of a degrading reagent (PITC) and a capping reagent (Fmoc-OSU, or Nic-OSU in a previous method). In each cycle, a small portion of the peptide chains is capped whereas the majority is degraded. A series of sequence-specific peptide fragments is thus generated from each resin-bound peptide, and the sequence of the peptide is then read from the MALDI-TOF mass spectrum. This method allows us to sequence individual library members rapidly. Using OBOC peptide libraries, our laboratory has successfully studied ligand binding specificities [4] and substrate specificities [8].
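The ladder-reading idea behind PED can be illustrated with a short Python sketch. It is a simplified model and not the laboratory's actual data-processing procedure: it assumes an idealized spectrum with exactly one peak per truncation product, ignores the masses of the capping group and linker, and uses standard monoisotopic residue masses. The function names `ped_ladder` and `read_sequence` are hypothetical. Isoleucine is left out of the table because it is isobaric with leucine and cannot be distinguished by mass alone.

```python
# Monoisotopic residue masses (Da) of the standard amino acids.
# Ile is omitted on purpose: it has the same mass as Leu.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def ped_ladder(sequence):
    """Idealized masses of the full-length peptide and of every
    N-terminally truncated fragment (cap and linker masses ignored)."""
    return [
        sum(RESIDUE_MASS[r] for r in sequence[start:]) + WATER
        for start in range(len(sequence))
    ]


def read_sequence(peaks, tol=0.01):
    """Read residues from the mass differences between adjacent peaks,
    going from the heaviest peak (full length) to the lightest."""
    ordered = sorted(peaks, reverse=True)
    residues = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [r for r, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)


ladder = ped_ladder("HPQFA")
print(read_sequence(ladder))  # HPQF
```

In this toy model the last residue of the ladder is not read, because no lighter peak follows it; each residue is assigned from the spacing between two neighboring peaks.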

Nevertheless, OBOC peptide libraries are limited by the way the peptides are synthesized on the solid phase. The peptides on the resin are oriented so that the N-terminus of each peptide is exposed on the surface, while the C-terminus is covalently linked to the resin support. To overcome this limitation, we have developed a “cyclic peptide library” and a “free C-terminal peptide library”.


Figure 1.1. Resin-Bound Peptide Display. (a) Conventional peptide synthesis displays N-terminus of peptide on the surface. (b) Free C-terminal peptide library (Chapter 3). (c) Cyclic peptide library (Chapter 2).

In the following sections, the characteristics of cyclic peptides, some examples, and different approaches to their development will be reviewed. The need for a “free C-terminal peptide library”, the biological targets for such a library, and the currently available methods will then be reviewed.

1.2 Cyclic Peptides

Cyclic peptides are polypeptide chains whose ends are linked to form a ring. In a homodetic cyclic peptide, the ring consists solely of amino acids, and the amino and carboxyl termini are joined by a peptide bond, which makes it impossible to identify the ends of the chain. In heterodetic cyclic peptides, cyclization may be mediated by various other chemical bonds, such as lactone, ether, thioether, and disulfide linkages.
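Because a homodetic ring has no defined start, the same cyclic peptide can be written in several equivalent ways, one for each rotation of the sequence. The Python sketch below illustrates one way to compare such notations by reducing every ring to a canonical rotation. It is an illustration only: the function names are hypothetical, sequences are written in one-letter code with lowercase letters for D-amino acids, and the direction of the backbone is assumed to be the same in both notations.

```python
def rotations(ring):
    """All cyclic rotations of a ring written as a linear string."""
    return [ring[i:] + ring[:i] for i in range(len(ring))]


def canonical_ring(ring):
    """Lexicographically smallest rotation, used as a canonical name."""
    return min(rotations(ring), default="")


def same_cyclic_peptide(a, b):
    """True if the two strings describe the same head-to-tail ring."""
    return len(a) == len(b) and canonical_ring(a) == canonical_ring(b)


# cyclo(RGDfV) and cyclo(GDfVR) are the same ring read from different residues
print(same_cyclic_peptide("RGDfV", "GDfVR"))  # True
# swapping two residues gives a different ring
print(same_cyclic_peptide("RGDfV", "RGDVf"))  # False
```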

Since tyrocidine and gramicidin S were proposed and confirmed to be cyclic peptides [9, 10], there have been numerous cyclic peptides and depsipeptides found in nature. Many naturally occurring cyclic peptides have diverse biological activities; gramicidin and tyrocidine show bactericidal activity, while cyclosporin A, a cycloundecapeptide from fungus, is clinically used as an immunosuppressant.

Cyclic peptides are conformationally more rigid than their linear counterparts [11, 12], and this rigidity allows enhanced receptor-binding affinities and specificities [13]. Moreover, cyclic peptides are resistant to exopeptidases because they lack both amino and carboxyl termini. These features make cyclic peptides attractive leads for drug discovery and excellent molecular probes for biological research. For this reason, diverse approaches have been taken to find biologically active cyclic peptides or to improve the activities of natural cyclic peptides using combinatorial chemistry.

1.3 Biologically Active Cyclic Peptides

In the following sections, some cyclic peptides with biological activities will be reviewed, along with the applications of combinatorial chemistry to improve their activities.

1.3.1 Tyrocidine and Gramicidin S

Tyrocidine and gramicidin S are both cyclodecapeptides with antibacterial activity. Gramicidin, on the other hand, is a mixture of linear pentadecapeptides with antibacterial activity. Tyrocidine was found in a culture extract of a soil bacillus, Bacillus brevis, as a bactericidal agent as early as 1939, along with gramicidin [14, 15]. Initial chemical characterization revealed that it is a peptide lacking a free amino terminus, and it was proposed to have a cyclic structure in which the amino terminus and the carboxyl terminus are linked by an amide bond [9]. Later, the mixture of gramicidin and tyrocidine was named tyrothricin [16]; it became the first commercialized antibiotic and is in clinical use even today. Gramicidin S (Soviet gramicidin) was discovered on the other side of the Earth in the early 1940s and has a structure similar to that of tyrocidine [10].

Figure 1.2. Structure of Gramicidin S and Tyrocidine A. Upper: Gramicidin S, cyclo(Val-Orn-Leu-D-Phe-Pro)2. Lower: Tyrocidine A, cyclo(Val-Orn-Leu-D-Phe-Pro-Phe-Phe-Asn-Gln-Tyr).

As shown in Figure 1.2, gramicidin S is composed of two identical pentapeptides, written as cyclo(Val-Orn-Leu-D-Phe-Pro)2. The structural model shows two β-strands linked by proline residues and four intramolecular hydrogen bonds; the model was proposed by Hodgkin and Oughton in the 1950s [17] and confirmed by X-ray crystallography twenty years later [18]. The intramolecular hydrogen bonds add further rigidity to the innately rigid structure of this cyclic peptide, which makes it a good example of the chemical characteristics of cyclic peptides. Tyrocidine has several isotypes, and the chemical structure of tyrocidine A is shown in Figure 1.2. Part of the sequence (Val-Orn-Leu-D-Phe-Pro) is identical to that of gramicidin S, and the structural features of tyrocidine are similar as well [19]. These cyclic peptides have amphipathic character, to which their antibacterial activity is attributed [20]. The molecule presents a discrete hydrophobic face and a basic (cationic) face, and this feature is believed to help disrupt the bacterial membrane [21, 22]. According to one of the models, the cationic face first interacts with the negatively charged lipid head groups, and the hydrophobic face then makes contact with the lipid portion of the membrane, resulting in rupture of the membrane [23]. While tyrothricin was the first commercialized antibiotic, its use is limited to topical application because of its low selectivity; tyrocidine A disrupts not only bacterial cell membranes but also mammalian cell membranes. Nonetheless, these peptides are still attractive targets for antibiotic development because resistance is less likely to develop. It would be hardly possible for bacteria to develop resistance to a membranolytic effect, since this would require a significant change in membrane structure [24].

Until recently, the modification or improvement of cyclic peptides such as tyrocidine A and gramicidin S involved the individual synthesis of each analog, which makes it time-consuming to establish structure-activity relationships. For example, as of 1996, half a century after the discovery of gramicidin S, only about 200 gramicidin S analogs (decapeptides) had been synthesized [25], a negligibly small fraction of the theoretical diversity (over 10^13, from 20^10) of cyclic decapeptides consisting of the standard proteinogenic amino acids. More recently, Walsh and co-workers employed an enzyme domain from a non-ribosomal peptide synthetase to cyclize precursors synthesized on the solid phase [26]. They supplied the resin-bound peptide precursor to the terminal thioesterase domain, which catalyzed the cyclization of the precursor peptide and released the cyclic peptide tyrocidine A (Figure 1.2). By replacing the amino acids at positions 1 and 4 (D-Phe1 and D-Phe4) with different natural and unnatural amino acids (8 and 24, respectively), they synthesized 192 different tyrocidine A analogs and optimized the selectivity. They obtained one tyrocidine A analog with almost the same antibacterial activity but a reduced hemolytic effect.

Another remarkable approach was taken by Guo and co-workers [27, 28], who exploited the propensity of the tyrocidine A precursor to adopt a conformation highly favorable for head-to-tail cyclization [29]. They prepared tyrocidine A precursor peptides in which the amino acids at positions 3, 4, 5, and 6 were replaced by different amino acids (4, 2, 4, and 6 different amino acids at the respective positions). The parallel synthesis gave a fairly large number of analogs (4 x 2 x 4 x 6 = 192), from which they obtained a more potent and less toxic compound.

Even though these two approaches certainly yielded improved tyrocidine analogs, some limitations remain. As mentioned above, the diversity they could reach was less than 200, owing to the limitation of sequential or parallel synthesis. A comparison of parallel synthesis and split-and-pool synthesis is shown in Figure 1.3. For the tyrocidine analogs, the tendency of the precursor peptides to adopt a conformation favoring head-to-tail cyclization helped the preparation of the cyclic peptides. This held true when part of the sequence was replaced, but it is not clear how much modification would be tolerated. Therefore, these two methods may not be generally applicable to the preparation of cyclic peptides.
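The practical difference between the two synthesis schemes can be made concrete by counting coupling reactions. The Python sketch below is a rough illustration under stated assumptions, not a description of any published protocol: in the parallel case every compound is assumed to be built in its own vessel with no shared intermediates, in the split-and-pool case the resin is split into one vessel per building block at each varied position, and only the varied positions are counted. The function names are hypothetical.

```python
from math import prod


def n_compounds(choices_per_position):
    """Number of distinct compounds for the given building-block counts."""
    return prod(choices_per_position)


def parallel_couplings(choices_per_position):
    """Couplings if each compound is made separately (assumption:
    one vessel per compound, no shared intermediates)."""
    return n_compounds(choices_per_position) * len(choices_per_position)


def split_and_pool_couplings(choices_per_position):
    """Couplings if the resin is split into one vessel per building
    block at each position and pooled again afterwards."""
    return sum(choices_per_position)


# The 4 x 2 x 4 x 6 design discussed above
print(n_compounds([4, 2, 4, 6]))  # 192

# Five fully random positions with 20 amino acids each
print(parallel_couplings([20] * 5))        # 16000000
print(split_and_pool_couplings([20] * 5))  # 100
```

Under these assumptions the number of compounds grows as a product of the choices per position, while the effort of split-and-pool synthesis grows only as their sum, which is why the latter reaches diversities that are out of reach for parallel synthesis.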

Figure 1.3. Comparison of the Parallel Synthesis and Split-and-Pool Synthesis. (a) Parallel synthesis. (b) Split-and-pool synthesis.

1.3.2 Cyclosporin A

Cyclosporin A ([R-[R*,R*-(E)]]-cyclic(L-alanyl-D-alanyl-N-methyl-L-leucyl-N-methyl-L-leucyl-N-methyl-L-valyl-3-hydroxy-N,4-dimethyl-L-2-amino-6-octenoyl-L-α-aminobutyryl-N-methylglycyl-N-methyl-L-leucyl-L-valyl-N-methyl-L-leucyl)) is a fungal cyclic peptide isolated from Tolypocladium inflatum Gams. It has immunosuppressive activity and is used clinically worldwide to prevent graft rejection, with annual sales of US$ 1.0 billion [30]. The total synthesis of this compound was first reported in 1984 by Wenger [31]. Several of its features draw attention. The cyclic structure and the presence of several Nα-methylated amino acids make the compound a challenge to prepare synthetically. It crosses the cell membrane freely, possibly as a consequence of its cyclic peptide structure. Because it is highly toxic, there is a need to modify and improve the compound, but few combinatorial approaches have been reported. One interesting approach, by Aramburu and co-workers, used a peptide library to mimic one of the functions of cyclosporin A [32]. Rather than modifying the structure of cyclosporin A, they focused on its function of inhibiting the phosphatase activity of calcineurin. They therefore built a peptide inhibitor that mimics the docking site on the calcineurin substrate and has no structural similarity to cyclosporin A. The lack of combinatorial approaches for cyclosporin A could be attributed to the technical challenge of dealing with cyclic peptides and unnatural amino acids.

1.3.3 RGD Peptide

Integrins are a large family of receptor proteins that are important for the interaction of cells with the extracellular matrix. Many integrins recognize the tripeptide sequence Arg-Gly-Asp (RGD) for this interaction. In the late 1980s, it was noticed that peptides containing RGD could inhibit tumor cell growth [33, 34]. These peptides were initially linear, and several studies by Kessler and co-workers then reported cyclic peptides containing the RGD motif [35, 36]. As expected, the cyclic peptides were more active than the linear peptides. Ruoslahti and co-workers used a phage display library to find peptides that bind α5β1 integrin and obtained the sequence GAC*RGDC*LGA (*, disulfide linkage forming a cyclic peptide) [37]. They had not intended to select cyclic peptides, but found that a cyclic peptide was the most potent. In 1994, the Cheresh group reported that injection of a cyclo-RGD peptide (EMD 66203, cyclo-RGDfV, where f is D-Phe; Figure 1.4) could promote tumor regression by inducing apoptosis of angiogenic blood vessels [38]. This compound was investigated further; an improved version was reported in 1999 [39], and clinical trials followed [40].

In the development of this compound, each candidate was synthesized on the basis of NMR structures and molecular dynamics studies, and the throughput of each study was usually fewer than ten molecules. With the RGD sequence fixed, there are 400 (20^2) and 8,000 (20^3) possible cyclic penta- and hexapeptides, respectively, even when only the 20 natural amino acids are used. Even though it may not be necessary to try all of these combinations, this clearly illustrates the technical difficulty of studying cyclic peptides against biological targets.


Figure 1.4. Cyclo-RGD Peptide (EMD 66203, Cyclo-RGDfV).

1.3.4 RNA Binding Cyclic Peptides

Varani and co-workers developed BIV2 and related molecules that bind to the transactivation response element (TAR) RNA of bovine immunodeficiency virus (BIV) [41, 42]. BIV2 works as an inhibitor of the Tat-TAR interaction, which is essential for viral replication. They developed this compound by coupling a D-Pro-L-Pro dipeptide to the peptide sequence from the Tat protein, mimicking the β-hairpin structure found in the Tat-BIV TAR complex. From the initial library of 8 different cyclic peptides (BIV1 through BIV8), BIV2 was the best, with a KD value of ~150 nM. After BIV2 was developed, they studied the contribution of individual residues by replacing each residue with Ala. The parallel synthesis of no more than 100 compounds yielded a compound with nanomolar affinity. BIV2 shows the applicability of cyclic peptides to targeting RNA molecules, and again highlights the technical difficulty of developing cyclic peptides.

1.4 Cyclic Peptides with Genetic Encoding

Several combinatorial approaches have already been introduced in the previous sections. Here, the advantages and disadvantages of cyclic peptide libraries that utilize genetic encoding will be discussed.

1.4.1 Phage Display for Cyclic Peptides

Phage display technology, first introduced by Smith in 1985 [43], allows the display of a peptide library on the phage surface, from which members can be selected for binding to a molecule of interest. This technique takes advantage of the ability of phage to tolerate the addition of foreign peptides to its coat proteins. Usually, the foreign peptides are displayed at the N-terminus (right after the signal peptide sequence), in the middle, or at the C-terminus of different coat proteins, such as protein-3 or protein-8, depending on the design of the library. The peptide sequence displayed on each phage particle is encoded by the DNA of the same particle and can therefore be identified easily afterwards. Unlike screening of a synthetic peptide library, phage display selection can be repeated (called bio-panning) as long as the encoding DNA molecules are preserved, and this feature allows the enrichment of the best binders. Another strength of this technology lies in the diversity it can achieve, typically up to 10^9. As described above, cyclic peptides can be obtained from phage display, as seen in the example of the RGD peptide [37]. That was probably the first report in which a disulfide-linked cyclic peptide was selected by phage display without any plan to screen cyclic peptides. In 1992, one year before that report, O’Neil and co-workers applied phage display to screen CX6C sequences against platelet glycoprotein IIb/IIIa [44]. The phage particles are released into the oxygen-rich periplasmic space of the bacteria, where two neighboring Cys residues naturally form a disulfide bond to give a cyclic peptide. There are many examples in which cyclic peptides from phage display show affinity for receptor molecules. Several reviews of phage display are available, such as the one by Smith and Petrenko [45].

Phage display has drawbacks for cyclic peptides. First, cyclic peptide formation relies on disulfide bond formation; when head-to-tail cyclization is desired, this method cannot be used. Second, phage display is limited to natural, ribosomal amino acids. Many cyclic peptides contain non-ribosomal amino acids, which are not accessible with the phage display technique.

1.4.2 Intein-Mediated Cyclization

Inteins (Internal Proteins, or Intervening protein sequences) are protein sequences that are spliced out during maturation. In a precursor protein, the intein is part of the precursor and is called the fused intein. After maturation, the intein is spliced out of the precursor protein and is called the free intein, and the protein sequences flanking the intein, called exteins, are ligated (Figure 1.5a). During maturation, the folding of the fused intein brings together the splice junctions and the residues that catalyze the process [46, 47]. In 1999, Scott and Benkovic reported split-intein circular ligation of peptides and proteins (SICLOPPS), which utilizes the DnaE trans-intein to prepare cyclic peptides [48].

[Figure 1.5 graphic: (a) precursor protein (Extein-N, fused intein, Extein-C), intein folding, mature protein + free intein; (b) reaction scheme for split-intein-mediated peptide cyclization]

Figure 1.5. Intein Mediated Cyclization. (a) General intein protein maturation process. (b) Cyclic peptide cyclo(XA2A3A4A5) formation utilizing split intein. X: Cys or Ser; Z: S or O; IC: intein-C fragment; IN: intein-N fragment.

As seen in Figure 1.5b, the intein-C fragment, ending with an Asn residue, is fused with a target peptide sequence beginning with Cys or Ser, followed by the intein-N fragment beginning with a Cys residue. During maturation, an N-to-S acyl shift occurs at the Cys residue of the intein-N fragment, forming a thioester. The thioester undergoes transesterification with the side-chain nucleophile of the first residue (Cys or Ser) of the target peptide. This leads to the formation of a lariat intermediate, and the Asn side chain of the intein-C fragment liberates the cyclic product as a lactone, which rearranges into the thermodynamically more stable lactam product. By constructing a SICLOPPS vector that allows the insertion of a target peptide into the fusion protein, cyclic peptide libraries can be prepared in vivo, where each clone expresses a unique cyclic peptide. The authors used this method for the genetic selection of inhibitors of protein-protein interactions [49, 50].

SICLOPPS is certainly a valuable tool for the development of cyclic peptides. The diversity it can provide (typically ~10⁸) and the fact that cyclic peptides can be prepared in vivo make this technology attractive. However, cyclic peptide synthesis in vivo can also be a limitation, as it prevents versatile in vitro screening. In addition, the method always requires a Cys (or Ser) residue in the sequence, and the choice of amino acids is limited because it relies on ribosomal protein synthesis. Because many biologically active cyclic peptides contain non-ribosomal amino acids, this is a drawback of the method.

1.4.3 mRNA Display Based Cyclic Peptides

mRNA display is an in vitro method of displaying peptides/proteins covalently linked to their encoding mRNA. As shown in Figure 1.6, a DNA library is prepared first and the corresponding mRNA is produced for each member of the library. The mRNA is then coupled to a DNA tail carrying puromycin at its 3’ end. The RNA/DNA hybrid molecule is then used as a template for in vitro protein synthesis, usually in rabbit reticulocyte lysate. As the mRNA template lacks a stop codon, the ribosome stalls at the end of translation, and the puromycin on the DNA tail is incorporated at the C-terminus of the peptide/protein. Because each mRNA molecule is linked to the peptide/protein it encodes, bio-panning can be repeated as in phage display and other biological library methods. While this method was initially developed by Szostak and Roberts to prepare linear polypeptide libraries [51], Roberts and co-workers recently devised a method to prepare cyclic peptides based on mRNA display [52]. They used a chemical cross-linker, disuccinimidyl glutarate (DSG), to couple the N-terminal amine of a peptide with the side chain of a Lys residue in its C-terminal region. They recently reported a cyclic peptide that targets the signaling molecule Gαi1 with high affinity [53]. They stated that Nα-methylphenylalanine (NMF) was used as one of the building blocks for in vitro translation through nonsense suppression. However, none of the cyclic peptides they reported from the screening actually contained this Nα-methylated amino acid. This suggests at least two possibilities. One is simply that NMF was not favored for binding to the target molecule; the other is that steric hindrance prevented efficient incorporation of NMF during in vitro translation. One drawback of mRNA display is that only one peptide is produced from each mRNA template. If poor amino acid incorporation causes translation of a peptide to fail, the mRNA encoding that peptide is lost during bio-panning regardless of the actual binding affinity of the construct. Another limitation is that the method cannot provide the homodetic cyclic peptides found in nature, because a chemical linker is used to form the cyclic structure. In addition, when there are multiple Lys residues in the sequence, the cyclization reaction would not proceed selectively to form the desired cyclic peptide. Finally, cross-linker-mediated dimer formation between two mRNA-peptide hybrid molecules cannot be excluded.

[Figure 1.6 graphic: (a) mRNA display cycle with the labels DNA template (T7, TMV, ATG, random nucleotides, no stop codon), T7 RNA polymerase, puromycin-DNA linker added to the 3’ end of the mRNA, in vitro translation, ribosome stall, puromycin incorporation, reverse transcription, affinity selection, and PCR amplification; (b) reaction of the mRNA-linked precursor with disuccinimidyl glutarate to give the mRNA-linked cyclic peptide]

Figure 1.6. mRNA Display Based Cyclic Peptides. (a) General scheme of mRNA display. P inside oval: puromycin. (b) Preparation of a cyclic peptide using mRNA display and the chemical linker disuccinimidyl glutarate.

1.5 Synthetic Cyclic Peptide Libraries

While the biosyntheses of natural cyclic peptides are not yet fully understood, these processes involve non-ribosomal peptide synthesis, which allows the incorporation of non-ribosomal amino acids (e.g., D-amino acids and Nα-methylated amino acids) and non-amino acid moieties. The occurrence of many non-ribosomal amino acids in natural cyclic peptides makes the biological libraries described above less attractive for combinatorial library preparation. Because synthetic approaches permit the incorporation of non-natural amino acids and non-amino acid building blocks, the development of synthetic cyclic peptide libraries has been pursued in search of biologically active compounds.

Synthetic cyclic peptides can be prepared by several strategies. Ring closure between the N-terminus and the C-terminus of the peptide chain forms a lactam and gives head-to-tail homodetic cyclic peptides. Alternatively, either the N-terminus or the C-terminus can be linked to a side chain, or two side chains can be linked to each other, giving heterodetic cyclic peptides. In many cases, solid phase peptide synthesis appears to be the method of choice because of the pseudodilution phenomenon, which favors intramolecular cyclization reactions [54].

[Figure 1.7 graphic: reaction scheme for on-resin cyclization of an Fmoc-protected peptide]

Figure 1.7. Solid Phase Synthesis of Cyclic Peptides.

Solid phase synthesis of cyclic peptides relies on orthogonal protecting groups, as in the method reported by Tanner and co-workers in 1993 [55]. In Figure 1.7, the peptide on the bead support carries an Fmoc protecting group at the N-terminus and an allyl protecting group at the C-terminal carboxyl of a Glu residue that is linked to the resin through its side chain. Removal of those two protecting groups, while all other protecting groups on the peptide remain in place, allows selective coupling between the free N-terminal amine and the C-terminal carboxyl group to form a cyclic peptide bound to the bead. Depending on the X group, the peptide can either remain on the bead or be released from it. Large combinatorial libraries of cyclic peptides can routinely be prepared this way by the split-and-pool method. However, unlike the biological library methods described in the previous sections, sequence determination of hit peptides has been problematic. Phage display, SICLOPPS, and mRNA display all carry encoding DNA/RNA sequences that reveal the identity of hits. Sequence determination of synthetic cyclic peptide library members has been challenging because they no longer have the free N-terminal amine required for conventional Edman degradation. In the following sections, two approaches to determining the identity of hits in cyclic peptide libraries are discussed. Individual synthesis and parallel synthesis of cyclic peptides were discussed above and are not discussed again, as their technical difficulty has already been highlighted.

1.5.1 Iterative Deconvolution

Iterative deconvolution is based on repeated cycles of synthesis of soluble compound pools followed by activity screening, which guide the preparation of progressively smaller sublibraries. As shown in Figure 1.8, a first set of libraries is tested, and sublibraries are then prepared successively to identify the hit. If the library was constructed in n steps (e.g., 5 steps for a pentapeptide), n rounds of synthesis and screening are necessary to find the best library member(s). This method has been used successfully for many biological targets since it was introduced by Houghten and co-workers in 1991 [3]. As the method does not require the sequence determination of each library member, it can be used for cyclic peptide libraries. One example for cyclic peptides is the study by Chu and co-workers in 1998 [56]. They built a soluble library of cyclo(AXXXXXAE)K-CONH2 and screened it for streptavidin binding. After repeated screening, they identified cyclo(AHPQFXAE)K-CONH2 as tight-binding ligands of streptavidin. This process required 5 rounds of synthesis and screening, and this time-consuming procedure is one of the drawbacks of the method. Furthermore, the method depends on the cumulative activity of the mixture in each pool. The fact that one pool shows the best activity does not guarantee that the best compound is in that pool. A pool containing many low-activity compounds and a small amount of a high-activity compound may show lower activity than another pool containing only compounds of average activity. In that case, the best compound will never be found by the screening process.
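The pooling pitfall described above can be illustrated with a small simulation. The following is only a minimal sketch: the four-letter alphabet, the three-residue library, the activity values, and the assumption that a pool's measured activity equals the mean activity of its members are all hypothetical and are not taken from the cited studies.

```python
from itertools import product

ALPHABET = "ABCD"   # hypothetical building blocks
LENGTH = 3          # hypothetical library length

def activity(seq):
    """Invented activity model: one outstanding compound, many mediocre ones."""
    if seq == "ACD":
        return 100.0            # the single best compound
    if seq.startswith("B"):
        return 30.0             # a pool of uniformly average compounds
    return 1.0                  # everything else is nearly inactive

def pool_activity(prefix):
    """Mean activity of all sequences sharing the fixed prefix (assumed pool readout)."""
    rest = LENGTH - len(prefix)
    members = [prefix + "".join(t) for t in product(ALPHABET, repeat=rest)]
    return sum(activity(m) for m in members) / len(members)

def iterative_deconvolution():
    """Fix one position per round, always keeping the most active pool."""
    prefix = ""
    for _ in range(LENGTH):
        prefix = max((prefix + a for a in ALPHABET), key=pool_activity)
    return prefix

all_seqs = ["".join(t) for t in product(ALPHABET, repeat=LENGTH)]
true_best = max(all_seqs, key=activity)
found = iterative_deconvolution()
print("deconvolution result:", found, "| true best:", true_best)
```

In this toy model the pool beginning with A is dominated by inactive members, so the first round selects the B pool and the best compound is discarded before it can be identified.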

First library:   AXXXX  BXXXX  CXXXX  DXXXX   best activity: AXXXX
Second library:  AAXXX  ABXXX  ACXXX  ADXXX   best activity: ABXXX
Third library:   ABAXX  ABBXX  ABCXX  ABDXX   best activity: ABDXX
Fourth library:  ABDAX  ABDBX  ABDCX  ABDDX   best activity: ABDCX
Fifth library:   ABDCA  ABDCB  ABDCC  ABDCD   best compound: ABDCB

Figure 1.8. Concept of Iterative Deconvolution.

1.5.2 Tandem Mass Spectrometry

Tandem mass spectrometry, or MS/MS, combines two or more mass analyzers for sample analysis. In MS/MS, the ion generated by the first MS is further fragmented in a reaction chamber by methods such as collision induced dissociation (CID), and the resulting fragment ions are analyzed by the second MS. MS/MS has been a very useful tool, especially for peptide sequencing [57]. While it is routinely used for the sequence determination of linear peptides, its application to cyclic peptides has been challenging. Even though several studies have reported the use of MS/MS for cyclic peptides [58-62], there seems to be no practical use of MS/MS for cyclic peptide libraries yet. The most successful MS/MS method for a cyclic peptide library is the one reported by the Ghadiri group [62], in which the accuracy achieved was only ~77%. Furthermore, they used peptides derived from macro-beads, which typically yield ~100 nmol of peptide per bead, whereas the more popular micro-beads have a loading of only ~0.1–1 nmol per bead, which is 100- to 1,000-fold less. The challenge of analyzing cyclic peptides by MS/MS comes from their structure. During MS/MS analysis of a cyclic peptide, ring opening can occur at any peptide bond, so the parent molecule gives multiple linear peptides with the same number of amino acids. This makes the analysis more complex than that of linear peptides. Figure 1.9 illustrates the difference between a hypothetical cyclic pentapeptide and its corresponding linear peptide. As the ring size becomes larger, the fragment ions become more difficult to analyze.

(a) Linear peptide A-B-C-D-E
    Five residues: A-B-C-D-E
    Four residues: A-B-C-D, B-C-D-E
    Three residues: A-B-C, B-C-D, C-D-E
    Two residues: A-B, B-C, C-D, D-E
    One residue: A, B, C, D, E
    (total 15 fragments)

(b) Cyclic peptide cyclo(A-B-C-D-E)
    Ring-opened forms: A-B-C-D-E, B-C-D-E-A, C-D-E-A-B, D-E-A-B-C, E-A-B-C-D
    Four residues: A-B-C-D, B-C-D-E, C-D-E-A, D-E-A-B, E-A-B-C
    Three residues: A-B-C, B-C-D, C-D-E, D-E-A, E-A-B
    Two residues: A-B, B-C, C-D, D-E, E-A
    One residue: A, B, C, D, E
    (total 25 fragments)

Figure 1.9. Collision Induced Dissociation Fragmentation Patterns for Linear Peptides and Cyclic Peptides. (a) Possible fragmentations for a hypothetical pentapeptide A-B-C-D-E yielding 15 different fragments from the mother molecule. (b) Fragmentations for a cyclic peptide cyclo(A-B-C-D-E) yielding 25 different fragments.
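The fragment counts in Figure 1.9 can be reproduced by simple enumeration. The sketch below is an illustration under simplifying assumptions: it counts only contiguous sequence segments of a peptide with distinct residues, treats the intact linear peptide and the ring-opened forms as members of the count, and ignores ion types (e.g., b and y ions) and neutral losses. The function names are hypothetical.

```python
def linear_fragments(residues):
    """All contiguous sub-sequences of a linear peptide (parent included)."""
    n = len(residues)
    return {tuple(residues[i:j]) for i in range(n) for j in range(i + 1, n + 1)}

def cyclic_fragments(residues):
    """All contiguous arcs of a cyclic peptide, including the n ring-opened forms."""
    n = len(residues)
    doubled = residues + residues          # lets a slice wrap around the ring
    return {tuple(doubled[i:i + k]) for i in range(n) for k in range(1, n + 1)}

peptide = list("ABCDE")
n_linear = len(linear_fragments(peptide))
n_cyclic = len(cyclic_fragments(peptide))
print(n_linear, n_cyclic)   # 15 25
```

For n distinct residues the counts are n(n + 1)/2 for the linear peptide and n² for the cyclic peptide, so the gap widens as the ring grows.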

1.6 Peptide Libraries with Free C-Termini

Combinatorial peptide libraries have diverse applications. They have been applied to identify specific binding ligands for receptors [4, 63-68], define the substrate specificity of enzymes [8, 69-74], and develop new catalysts [75-80]. However, there are proteins/enzymes that recognize the C-terminus of another peptide/protein, and these remain challenging targets for on-bead peptide library screening. In a typical resin-bound combinatorial peptide library, the N-termini of peptides are displayed on the bead surface while the C-termini are linked to the bead support, because conventional solid phase peptide synthesis proceeds in the C to N direction. While there have been a few approaches to synthesis in the reverse, N to C direction [81], they usually suffer from low yields caused by side reactions and from racemization problems [82] (Figure 1.10). Furthermore, the identification of hits would be another challenge. Since the peptides no longer have free N-termini on the surface, conventional Edman degradation is not possible.

In the following sections, PDZ domains and 14-3-3 proteins, which recognize C-termini for interaction, are introduced. In addition to these two protein families, there are many enzymes that catalyze the C-terminal modification of proteins, including C-terminal specific proteases [83, 84] and C-terminal lipidation enzymes [85, 86]. All of these proteins would be possible targets for C-terminal peptide libraries.

[Figure 1.10 graphic: reaction schemes involving the C-terminally activated peptide (Act: activating group)]

Figure 1.10. Side Reactions of N to C Peptide Synthesis. Upper: Diketopiperazine formation; Center: Oxazolone formation; Lower: C to N peptide synthesis. Modified from [81].

1.6.1 PDZ Domains

Protein-protein interactions are key events in signal transduction and other functional organization within cells. Many of these protein-protein interactions are mediated by protein domains such as the well-known SH2 domains [87], phosphotyrosine-binding (PTB) domains [88], PDZ domains as described below, and many others. PDZ (Postsynaptic density-95/Discs large/Zonula occludens-1) domains are a large family of protein domains, widely found in organisms from bacteria to man. There are more than 400 PDZ proteins in humans [89, 90]. In eukaryotic cells, PDZ domain-containing proteins are usually found in the cytoplasm, where they function as adaptors that hold receptors and other signaling molecules together in large molecular complexes. PDZ domains usually comprise ~90 amino acids forming six β strands and two α-helices. Many solved structures of PDZ domains, alone or in complex with their binding ligands, illustrate the interaction mechanism [91, 92]. There is a carboxylate-binding loop between the βA and βB strands, and most of the molecular recognition for binding appears to occur through the four C-terminal residues of PDZ-interacting proteins. PDZ domains are grouped into several classes according to their C-terminal sequence specificities. One class of PDZ domains, to which PSD-95 belongs, shows the C-terminal sequence specificity X-Ser/Thr-X-Val/Leu; either Ser or Thr is found at the -2 position when the C-terminal residue is defined as position 0, and the C-terminal residue is either Val or Leu. Each PDZ domain recognizes a specific subset of peptide sequences, and its cellular function can be understood by defining the sequence specificity and searching for the physiological binding partner proteins. Several methods have been used to study the sequence specificities of PDZ domains, and they are discussed later in this chapter.
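The X-Ser/Thr-X-Val/Leu consensus described above can be expressed as a pattern anchored at the C-terminus. The following is a minimal sketch, assuming one-letter amino acid codes; the function name and the example sequences are hypothetical, and a real specificity profile is far richer than this single consensus.

```python
import re

# Consensus from the text: X-Ser/Thr-X-Val/Leu at positions -3 to 0,
# written in one-letter code and anchored at the C-terminus.
CLASS_MOTIF = re.compile(r"[A-Z][ST][A-Z][VL]$")

def matches_consensus(sequence):
    """True if the last four residues fit X-S/T-X-V/L."""
    return CLASS_MOTIF.search(sequence.upper()) is not None

# Hypothetical C-terminal sequences, for illustration only.
examples = ["KQTSV", "ETDL", "KQAAV", "TSVK"]
results = {s: matches_consensus(s) for s in examples}
print(results)
```

Only the first two examples match: the third lacks Ser/Thr at position -2, and the fourth does not end in Val or Leu.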

1.6.2 14-3-3 Proteins

14-3-3 proteins are ubiquitous proteins found in many species, including vertebrates, plants, and yeasts. They were first identified in 1967 as abundant, acidic proteins in the brain [93], and it later became clear that 14-3-3 proteins are expressed in many tissues [94]. Seven different isoforms of 14-3-3 protein (β, ε, γ, η, σ, τ, ζ) are known to exist in humans, and they have several roles in regulating signal transduction pathways such as cell cycle checkpoints, MAP kinase activation, apoptosis, and gene expression. Their physiological functions are mediated by their interactions with other proteins, and more than 100 proteins are known to interact with them. 14-3-3 proteins exist as dimers, each monomer (~30 kDa) comprising 9 α-helices. The interface formed by the interaction of the monomers provides a channel to which peptide and protein ligands bind [95]. 14-3-3 proteins recognize phosphoserine/phosphothreonine for binding. Yaffe and co-workers used a soluble phosphopeptide library to screen for binding to 14-3-3 proteins and revealed the RSXpSXP and RXY/FXpSXP sequences as binding motifs [96]. In addition to these two recognition motifs, 14-3-3 proteins bind their partner proteins through phosphorylated C-terminal sequences [97], pS/pT-(X)₁₋₂-COOH; this binding is referred to as mode III [98]. There have been limited studies to define the mode III sequence specificity of 14-3-3 proteins. Li and co-workers used a genetic screen for peptide motifs that promote surface localization and identified the SWTY sequence motif, which interacts with 14-3-3 proteins [99].

1.7 Biological Libraries for Studying the Specificities of PDZ Domains

There are several methods for the study of the sequence specificities of PDZ domains

based on biological libraries. These libraries display peptides/proteins in different ways

using DNA templates.

1.7.1 Phage Display

As mentioned in Section 1.4.1, peptides fused to the coat proteins can be displayed on the surface of phage particles. The most popular way of displaying peptides is through fusion to either protein-8 or protein-3. Initially, all phage-displayed peptide libraries were fused at the N-termini of these coat proteins; displaying peptides at the C-termini of coat proteins was thought to be impossible. Even though Jespers and co-workers introduced C-terminal fusion to coat protein-6 in 1995 [100], its application to PDZ domains has yet to be verified. In 2000, Fuh and co-workers developed a way to fuse peptide libraries at the C-terminus of protein-8 [101]. In their approach, a small number of fusion proteins were displayed along with a majority of wild-type protein-8, supplied by a helper phage. They demonstrated the validity of the method by analyzing the sequence specificities of two PDZ domains, and several reports have since used this method (e.g., [102]). Phage display is a good tool for studying PDZ domains. However, as mentioned earlier, it is restricted to the 20 proteinogenic amino acids and is not compatible with post-translational modifications. The same limitation applies to the lacI repressor method described in the next section.

1.7.2 lacI Repressor

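[Figure 1.11 graphic: library plasmids carrying lacO sites and the lacI gene fused to random DNA, each bound by the lac repressor (tetramer) displaying its random peptide; the complexes are subjected to affinity screening]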

Figure 1.11. lac Repressor C-Terminal Peptide Library.

Schatz and co-workers developed a C-terminal display strategy utilizing the lacI repressor protein bound to plasmid DNA [103]. The so-called “peptides-on-plasmids” library relies on the extremely high affinity of the lac repressor protein for the lacO sites on the plasmid. In their method, the library plasmid contains the lacI gene with random coding sequences fused at its 3’ end. Inside the bacterial cell, the lacI-peptide fusion protein is produced and forms a complex with the same plasmid that encodes it by binding to the lacO sites (Figure 1.11). The purified plasmid-protein complex can be used for affinity screening. As the phenotype is linked to the genotype, the identity of the C-terminal sequence can be determined by DNA sequencing of the plasmid. In addition, by transforming E. coli again with the selected plasmids, affinity screening can be repeated for several rounds. Li and co-workers used this method to study the sequence specificity of the PDZ domain of nitric oxide synthase [104] and of the NHERF PDZ domains [105].

1.7.3 Two-Hybrid System

The two-hybrid system is a method for identifying protein-protein interactions, first introduced by Fields and Song in 1989 [106]. This method relies on the independence of the two functional domains of transcription factors. For example, the yeast protein GAL4 has a DNA-binding domain and a transcriptional activation domain. While each domain is functionally active even when it is not physically linked to the other, transcriptional activation occurs only when both domains are physically close. The two-hybrid system uses two hybrid proteins: one protein fused to the DNA-binding domain and the other fused to the transcriptional activation domain. Binding between the two proteins is detected when the reporter gene is turned on. Until the protein microarray was introduced, this method was the only high-throughput screening method for protein-protein interactions, and it remains a powerful tool. The initial two-hybrid system was based on yeast GAL4 and performed in yeast, and high-throughput screening can be achieved rapidly and inexpensively with this method. A target protein (called the bait) can be screened against a library of proteins (called the prey) derived from cDNA libraries to find its binding partners.

Kurschner and co-workers used a peptide containing an artificial PDZ target C-terminus as bait in a yeast two-hybrid screen to identify a PDZ domain protein, CIPP [107]. This approach shows both the power and the limitations of the method. While the two-hybrid system is a high-throughput method, it is high-throughput in terms of the prey, not the bait. With the one-peptide, one-bait approach they used, it is impossible to cover all possible peptide sequences. In addition, they used CIPP (a PDZ protein) as bait to find targets for CIPP. While the two-hybrid system clearly gave a list of candidate binding partners for the PDZ domains, it provides sequence information only from the C-termini of those proteins, not from all possible peptide sequences that bind the PDZ domain.

1.7.4 FRET Based Screening

Fluorescence resonance energy transfer (FRET) is an interaction between two dye molecules in which energy is passed from one molecule (the donor) to the other (the acceptor) without emission of a photon. The efficiency of this interaction is inversely proportional to the sixth power of the intermolecular distance (∝ 1/r⁶) [108], making it very useful for studying the interactions of biological molecules. Daugherty and co-workers recently developed a FRET method for screening protein-protein interactions in the cytoplasm of E. coli [109]. They used a pair of fluorescent proteins, YPet and CyPet, fused with a PDZ domain from the PSD-95 protein and with 15-mer peptide libraries (2 × 10⁶ members), respectively. The proteins were expressed inside bacterial cells, and cells exhibiting FRET were detected and sorted by FACS. The selected cells were grown and then sorted again to enrich the population of cells with PDZ-binding peptides. The results of the screening correlated well with the known sequences for binding to the PDZ domain. The FRET hybrids allowed direct, quantitative, and near-real-time detection of protein-protein interactions. Moreover, the affinities could be ranked during the screening process. The limitation of the method, as with other biological libraries, is the restriction of building blocks to proteinogenic amino acids.
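The steep distance dependence of FRET can be made concrete with the standard Förster expression, E = 1/(1 + (r/R0)⁶), which reduces to the 1/r⁶ dependence stated above when r is much larger than R0. The sketch below is illustrative only: the Förster radius of 5 nm is an assumed round value, not a measured property of the YPet/CyPet pair, and the function name is hypothetical.

```python
def fret_efficiency(r_nm, r0_nm=5.0):
    """Förster efficiency E = 1 / (1 + (r/R0)^6); R0 = 5 nm is an assumed value."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r):.3f}")
```

Under this assumption the efficiency falls from about 98% at half the Förster radius to 50% at R0 and to under 2% at twice R0, which is why a FRET signal reports only on donor-acceptor pairs held in close proximity.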

1.8 Synthetic Libraries for Studying the Specificities of PDZ Domains

Several biological libraries have been reviewed in the previous paragraphs. Along

with those biological libraries, there are several approaches using synthetic peptides for studying PDZ specificities.

1.8.1 Solution Phase Screening Using Synthetic Peptide Library

Cantley and co-workers reported a study of PDZ domain specificity in 1997 [110] using oriented peptide libraries [111]. A soluble mixture of peptides with the sequence KNXXXXXXXX-COOH was synthesized and passed over a column containing a PDZ domain of interest. Peptides with affinity for the PDZ domain were retained on the column, and those without affinity were washed out. The retained peptides were subjected to Edman degradation, as a mixture, to reveal the amino acid preference at each position. Because a mixture of peptides is used for Edman degradation, individual sequences are not revealed; only the preference for specific amino acids at each position could be obtained from the study. This study provided valuable information about PDZ domain specificity. Nonetheless, the method has limitations. Because the peptides are sequenced as a pool, it cannot identify individual sequences and therefore cannot reveal any sequence covariance. Furthermore, if only a few peptides have strong affinity, they would be overlooked because of the abundant peptides with medium affinity.
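The loss of covariance information in pooled sequencing can be shown with a toy example. This is a minimal sketch, assuming an idealized readout in which pooled Edman degradation reports only the residue counts at each position; the two three-residue pools are hypothetical.

```python
from collections import Counter

def positional_profile(peptides):
    """Per-position residue counts, as an idealized pooled Edman readout would report."""
    length = len(peptides[0])
    return [Counter(p[i] for p in peptides) for i in range(length)]

# Two hypothetical retained pools with different residue pairings.
pool_1 = ["SAV", "TGL"]   # S pairs with A and V; T pairs with G and L
pool_2 = ["SGL", "TAV"]   # the pairings are swapped

same_profile = positional_profile(pool_1) == positional_profile(pool_2)
print(same_profile)   # True: the pooled readout cannot tell the pools apart
```

Both pools give the same per-position preferences even though no individual sequence is shared between them, which is the covariance information that only sequencing of individual hits can recover.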

1.8.2 Inverted Peptides on Solid Support

[Figure 1.12 graphic: reaction scheme for peptide inversion by cyclization and ester cleavage]

Figure 1.12. Peptide Inversion through Cyclization. Modified from [112].

While peptide synthesis in the reverse direction is challenging, it is possible to synthesize peptides in the conventional way and then invert their orientation using an appropriate linker. As shown in Figure 1.12, peptides can first be cyclized while attached to an ester linker, and hydrolysis of the ester then inverts the peptides, exposing free C-termini. This “reversal of peptide orientation via cyclization/cleavage” was first reported in 1994 by Kania and co-workers [112]. Davies and Bradley reported the synthesis of inverted peptides and used a 1,000-member OBOC inverted peptide library with 3 random positions to screen against a dansyl-labeled tweezer receptor [113, 114]. To identify the hits, they incorporated 20% of encoding peptides, which were mixed with the inverted peptides on the bead surface. After screening, individual hit beads were picked, and the inverted peptides were released from the resin; the encoding peptides remaining on the resin were subjected to Edman degradation for sequencing. One problem with this method is that the encoding peptides are mixed with the inverted peptides on the bead surface and may interfere with library screening.

Volkmer-Engert and co-workers used the peptide inversion method to study PDZ domain specificity [115-117]. Instead of building a peptide library by split-and-pool synthesis, they used SPOT synthesis. Inverted peptides corresponding to the C-terminal sequences of 6223 human proteins were synthesized on a cellulose membrane and screened against the Erbin PDZ domain. In SPOT synthesis, the identity of the peptide on each spot is determined by its position. Therefore, no encoding tag is needed, unlike in Davies and Bradley’s method. With the use of an automated system, thousands of unique peptides could be synthesized on the cellulose membrane and used to find possible binding targets for the PDZ domains. One of the drawbacks of this method is its limited diversity, which is usually not more than 10,000 for SPOT synthesis. While the study could find candidate binding targets, it does not provide a complete sequence specificity profile. In addition, it requires a sophisticated automated machine to build the peptide library, which is not available in every research lab. There are several reviews of SPOT synthesis, such as the one by Frank in 2002 [118].

1.8.3 Protein Microarray

A microarray, as the name implies, displays a series of molecules on a solid surface for analysis. In 1991, Fodor and co-workers from the Affymax Research Institute reported a parallel synthesis of 1024 peptides on a glass surface [119]. Each peptide was synthesized on a 50 μm square using a combination of solid phase peptide synthesis, a photo-labile protecting group, and photolithography. With 50 μm squares, 40,000 different compounds can be synthesized in 1 cm². Compared with SPOT synthesis (typically ~20 spots per cm²), a microarray holds a far larger number of compounds on the same surface area. This was the first peptide microarray, and the format became more popular after the DNA chip was introduced, again by Fodor and co-workers [120]. The molecule on each square of the solid surface can be prepared in several ways. The first is synthesis on the surface, as Fodor and co-workers did with the peptide microarray. The second is spotting onto squares that are functionalized for covalent coupling or non-covalent binding to the surface. For example, MacBeath and Schreiber used an aldehyde functional group in their protein microarray to couple proteins covalently through Lys residues [121].

Microarrays are usually combined with fluorescence detection, which allows researchers to measure the interaction of microarray members with the molecule of interest. Depending on how the microarray is designed, either an increase or a decrease in fluorescence can be observed upon binding of the target molecule to the hits on the microarray. One example, from MacBeath and Schreiber, is described below.

A protein microarray, as expected from the name, arrays proteins on a solid surface, usually glass. In 2000, MacBeath and Schreiber introduced the first protein microarray [121]. The density of spots they reported was 576 spots/cm² (10,800 spots in a 2.5 × 7.5 cm area). One of the applications they proposed was the study of protein-protein interactions. For instance, protein G on the chip was probed with BODIPY-FL-conjugated IgG (an antibody tagged with a fluorescent dye), and the chip was then scanned with a fluorescence slide scanner. Only the spots containing protein G were visible, as expected. In addition, they demonstrated applications in enzyme-substrate recognition by kinases and in protein-small molecule interactions.
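The feature densities quoted in this section (40,000 features per cm² for 50 μm squares, and 576 spots/cm² for 10,800 spots on a 2.5 × 7.5 cm area) follow from simple area arithmetic. The sketch below merely checks those numbers; the function names are hypothetical, and the first calculation assumes square features tiled edge to edge with no spacing.

```python
def features_per_cm2(edge_um):
    """Square features of the given edge length (micrometres) per cm^2, no gaps assumed."""
    per_cm = 10_000 / edge_um      # 1 cm = 10,000 micrometres
    return per_cm ** 2

def spot_density(n_spots, width_cm, height_cm):
    """Spots per cm^2 on a rectangular slide area."""
    return n_spots / (width_cm * height_cm)

peptide_array = features_per_cm2(50)             # photolithographic peptide array
protein_array = spot_density(10_800, 2.5, 7.5)   # printed protein microarray
print(peptide_array, protein_array)   # 40000.0 576.0
```

On these figures the photolithographic array is roughly 70 times denser than the printed protein array, and both are far denser than the ~20 spots per cm² quoted for SPOT synthesis.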

Protein microarrays have been used by MacBeath and co-workers to determine the specificities of PDZ domains [122, 123]. They exploited the strength of the protein microarray by spotting all of the soluble PDZ domains from mouse (157 total) on the chip. They then measured the affinity of fluorescently labeled peptides derived from C-terminal sequences of mouse proteins (217 total). The throughput provided by the protein microarray and the fidelity of the fluorescence-based affinity measurement are strengths of this assay. Nevertheless, this method cannot give a sequence specificity profile because of the limited number of peptides tested.


CHAPTER 2

SYNTHESIS AND SCREENING OF SUPPORT-BOUND CYCLIC PEPTIDE LIBRARIES *

2.1 Introduction

Cyclic peptides and depsipeptides, as explained in the previous chapter, have

enhanced receptor-binding affinity, specificity, and stability relative to their linear

counterparts due to their conformational rigidity. In addition to the examples of cyclic

peptides introduced in the earlier chapter, there have been numerous biologically active

cyclic peptides and depsipeptides found in nature [124, 125], and rational design and

screening of combinatorial libraries, synthetic or biological, added even more to the

arsenal of cyclic peptides. Nonetheless, currently available methods for cyclic peptides

have several limitations. As there are many nonproteinogenic amino acids found in

biologically active cyclic peptides, biological libraries may not be suitable in many cases

not to mention other limitations mentioned earlier. While synthetic libraries would allow

the incorporation of nonproteinogenic amino acids, many of the synthetic approaches

* Reproduced partially with permission from Joo, S.H.; Xiao, Q.; Ling, Y.; Gopishetty, B.; Pei, D. Journal of the American Chemical Society, 128, 13000-13009, Copyright 2006, American Chemical Society.

report the synthesis and screening of no more than 200 cyclic peptides at a time [26-28] and thus cannot utilize the full capacity of solid-phase split-and-pool synthesis. The limitation arises from the difficulty of determining the sequences of cyclic peptides.

Many synthetic approaches simply avoid this difficulty by parallel synthesis [26-28], iterative deconvolution [56, 126], or synthesizing a small number of compounds.

Sequencing of cyclic peptides by tandem mass spectrometry has not yet reached the stage where it can be used practically [58-62].

We report here the development of resin-bound cyclic peptide libraries that allow on-bead screening of millions of cyclic peptide library members.

2.2 Results

2.2.1 Design Strategy and Synthesis of Cyclic Peptide Libraries

Our strategy is to topologically segregate a resin bead into two different layers; the bead surface displays a cyclic peptide to be screened against a biological target(s), whereas the inner core carries the corresponding linear peptide as the encoding sequence, which can be readily determined by partial Edman degradation/mass spectrometry (PED/MS). To test this strategy, a cyclic octapeptide library containing four random residues was synthesized on TentaGel resin (90 μm, ~100 pmol peptide/bead) by cyclization between the N-terminus and the α-carboxyl group of a C-terminal glutamate (Figure 2.1).

Eighteen proteinogenic amino acids (excluding Met and Cys), Nle (replacing Met), and

Abu (replacing Cys) were incorporated at each random position; the theoretical diversity of the library is 20^4 = 160,000. A linker sequence, BBRM (B = β-alanine), was added to the C-terminus to facilitate CNBr cleavage and MS analysis [127]. During coupling of the C-terminal glutamate to the resin, the beads were segregated into outer and inner layers

using a technique pioneered by Lam [128]. Briefly, beads bearing the BBRM linker were

soaked in water, drained, and quickly suspended in 55:45 (v/v) dichloromethane/diethyl

ether containing 0.5 equivalent of a side chain N-hydroxysuccinimidyl (NHS) ester of L-glutamic acid, Nα-Fmoc-Glu(δ-NHS)-O-CH2CH=CH2. Because the organic solvent is

immiscible with water, only peptides on the bead surface were exposed to and reacted with the activated ester. The beads were washed with DMF and the remaining free N-terminal amines in the inner core (0.5 equivalent) were acylated with Fmoc-Glu(tBu)-OH.

Following the addition of an arbitrary dipeptide Ala-Leu, the random region was synthesized by the split-and-pool method [1-3] to give a “one-bead one-sequence” library.

A glycine was added to the N-terminus to facilitate peptide cyclization. Finally, the N-terminal Fmoc group and the α-allyl group on the C-terminal glutamate were removed by piperidine and Pd(PPh3)4, respectively. Subsequent treatment with benzotriazole-1-yl-oxy-tris-pyrrolidino-phosphonium hexafluorophosphate (PyBOP) cyclized the surface

peptides, while the peptides in the bead interior were kept in the linear form.

After we successfully synthesized the first cyclic peptide library

cyclo(GXXXXALE), called library I, we synthesized the second cyclic peptide library

cyclo(AXXXXXVE), called library II, with 5 random residues drawn from 26 amino acids (theoretical diversity = 26^5, or ~12 million). Each random position incorporated 14 nonproteinogenic amino acids, namely 6 D-amino acids [Ala, Glu, Phe, Leu, Asn, Val], 4 Nα-methylated amino acids [sarcosine (Sar), L-Nα-methylalanine (Mal or meA), L-Nα-methylleucine (Mle or meL), L-Nα-methylphenylalanine (Mpa or meF)], L-4-fluorophenylalanine (Fpa or fF), ornithine (Orn), L-phenylglycine (Phg), and Nle (replacing Met as above), as well as 12 proteinogenic amino acids [Asp, Gly, His, Ile, Lys, Pro, Gln, Arg, Ser, Thr, Trp, Tyr]. The nonproteinogenic amino acids, many of which are found frequently in naturally occurring nonribosomal peptides [124, 125], were

included to increase the structural diversity of the library.
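The theoretical diversities of libraries I and II follow directly from the number of building blocks per random position. The sketch below (Python; the function names are illustrative) computes them. As an added assumption that is not made in the text, it also estimates the fraction of sequences expected to appear at least once in a sample of beads, treating each bead as carrying an independent, uniformly random sequence. The bead count in the example is a placeholder.

```python
import math


def theoretical_diversity(building_blocks: int, random_positions: int) -> int:
    """Number of distinct sequences for a split-and-pool library."""
    return building_blocks ** random_positions


def expected_coverage(n_beads: float, diversity: int) -> float:
    """Expected fraction of sequences present at least once (Poisson approximation)."""
    return 1.0 - math.exp(-n_beads / diversity)


library_1 = theoretical_diversity(20, 4)   # 20 building blocks, 4 random positions
library_2 = theoretical_diversity(26, 5)   # 26 building blocks, 5 random positions
print(library_1)  # 160000
print(library_2)  # 11881376

# Placeholder bead count, for illustration only.
print(f"{expected_coverage(1_000_000, library_2):.3f}")
```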

Figure 2.1. Synthesis of Cyclic Peptide Library. Reagents: a) standard Fmoc/HBTU chemistry; b) soak in water; c) 0.5 equiv. Nα-Fmoc-Glu(δ-NHS)-O-CH2CH=CH2 in Et2O/CH2Cl2; d) excess Fmoc-Glu(tBu)-OH, HBTU; e) split-and-pool synthesis by Fmoc/HBTU chemistry; f) Pd(PPh3)4; g) piperidine; h) PyBOP, HOBt; and i) TFA. [Reaction scheme not reproduced.]

2.2.2 Cyclization Efficiency and Cyclic/Linear Peptide Ratio

To analyze the yield of peptide cyclization on the resin, a small aliquot of the above resin

during library I synthesis (~10 mg, after cyclization but prior to side chain deprotection) was treated with excess benzylamine (BnNH2) and PyBOP. The resulting resin was

treated with reagent K to remove side-chain protecting groups and 50 beads were randomly selected for MS analysis. The peptide on each bead was released by CNBr cleavage and analyzed by matrix-assisted laser desorption ionization-time of flight

(MALDI-TOF) MS. If cyclization of surface peptides was not complete, the remaining free α-carboxyl group on the C-terminal Glu would react with BnNH2 to give an m/z M +

107 peak in the MS spectrum (where M is the pseudo-molecular ion of the cyclic peptide).

In addition, the linear encoding peptide in the bead interior should produce a peak at m/z

M + 18 position. Among the 50 beads analyzed, 7 showed M + 107 peaks and their abundance was ≤5% (relative to the M peaks) in six of the cases (Table 2.1). Figure 2.2a shows the MS spectrum of a typical bead, on which the cyclization was complete (no M

+ 107 peak). Figure 2.2b shows one of the 7 MS spectra which had visible M + 107 peaks

(5% relative abundance). In a control experiment, an aliquot of the resin before cyclization was treated with excess BnNH2/PyBOP. The MS spectra of the resulting

beads all showed intense M + 107 peaks, which dominated the corresponding M peaks

(cyclic peptides) in most cases (see Figure 2.2c for an example). This indicates that the

reaction between the α-carboxyl group and BnNH2 was efficient under the experimental

conditions. The formation of peptide dimer and/or oligomers was also examined. Out of

the 50 beads, dimer formation was observed for 8 beads; in all cases the abundance was

12% or less (relative to the M peaks) (Table 2.1). No oligomer formation was detected on

any of the 50 beads. The yield of cyclization and the molar ratio of cyclic/linear peptides

on each bead were calculated from the relative abundance of the peaks, by assuming

equal ionization efficiency for cyclic, linear, benzylated, and dimeric peptides (which all have the same amino acid sequence and contain a C-terminal arginine for efficient ionization). With the exception of two beads (K1 and K24), the cyclization yield was typically ≥90% (Table 2.1). The percentage of cyclic peptides on beads varied greatly, from 0.3% (K17) to 99.4% (M19) (not counting bead K1). However, the calculated average value (%cyclic) for the 50 beads was 49%.
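The per-bead calculation described above can be sketched in a few lines of Python. The function and argument names are illustrative, and equal ionization efficiency for all species is assumed, as in the text. The %cyclic denominator below includes the M + 107 area, which is what reproduces the tabulated values for beads such as K8.

```python
def percent_cyclic(area_m, area_linear, area_benzyl=0.0, area_dimer=0.0):
    """Percentage of all peptide on a bead that is cyclic monomer (peak M)."""
    total = area_m + area_linear + area_benzyl + area_dimer
    return 100.0 * area_m / total


def cyclization_yield(area_m, area_benzyl=0.0, area_dimer=0.0):
    """Percentage of surface peptide that cyclized to monomer.

    Uncyclized surface peptide appears as the benzylated M + 107 peak;
    the linear M + 18 peak (bead interior) is deliberately excluded.
    """
    return 100.0 * area_m / (area_m + area_benzyl + area_dimer)


# Bead K8 from Table 2.1: peak areas for M, M + 18, M + 107, and dimer.
k8 = {"area_m": 9988, "area_linear": 8070, "area_benzyl": 533, "area_dimer": 654}
print(round(percent_cyclic(**k8), 1))                     # 51.9
print(round(cyclization_yield(9988, 533, 654), 1))        # 89.4
```

Peaks that are not detected (ND) are entered as zero areas.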

The above cyclization yields were surprisingly high (Table 2.1), since peptide cyclization had often been described as challenging and of low yields by previous investigators [129-

131]. In addition, the cyclization yields were based on the signal intensities in MALDI mass spectrometry, which is not commonly used for quantitative analysis. We therefore carried out additional experiments to confirm the observed cyclization yields. First, the percentage of cyclic peptides was determined by ninhydrin tests of the free amine contents for a small amount of the resin before and after cyclization (1.0 mg each). This

assay showed that ~50% of the peptides were cyclized (data not shown), in agreement

with the calculated value (49%).

[Figure 2.2 panels, x-axis m/z 1100-1400: (a) M (cyclic) 1176.5, M + 18 (linear) 1194.5, position of M + 107 (linear + BnNH2) marked; (b) M 1185.4, M + 18 1203.5, M + 107 1292.5; (c) M 1204.5, M + 18 1222.5, M + 107 1311.5.]

Figure 2.2. MALDI-TOF MS Spectra Showing the Peptide Cyclization on Bead. TentaGel resin was treated with BnNH2 and PyBOP after or before peptide cyclization. (a) The spectrum of a typical bead, with complete cyclization. (b) A small amount of benzylated product observed. (c) BnNH2, when added prior to cyclization, reacted faster than peptide cyclization.


Bead  M (cyclic)  Area   M+18 (linear)  Area   M+107   Area  Dimer   Area  %cyclic   Cyclization
      m/z                m/z                   m/z           m/z           peptide   yield (%)
J3    1130.2      4183   1148.2         4754   ND      ND    ND      ND    46.8      100.0
J4    1292.3      9894   1310.4         12724  ND      ND    ND      ND    43.7      100.0
K1    ND          ND     1349.9         9394   ND      ND    ND      ND    0.0       0.0
K2    1205.2      1141   1223.2         5019   ND      ND    ND      ND    18.5      100.0
K3    1222.2      1774   1240.2         8285   ND      ND    ND      ND    17.6      100.0
K4    1345.2      2053   1363.2         10075  ND      ND    ND      ND    16.9      100.0
K5    1262.4      371    1280.4         5979   ND      ND    ND      ND    5.8       100.0
K6    1180.4      2419   1198.5         4407   1287.5  69    ND      ND    35.1      97.2
K7    1293.5      5081   1311.6         5079   ND      ND    ND      ND    50.0      100.0
K8    1185.5      9988   1203.6         8070   1292.6  533   2388.2  654   51.9      89.4
K9    1260.6      5431   1278.6         8870   ND      ND    2538.5  702   36.2      88.6
K10   1180.4      7893   1198.4         8981   ND      ND    2378.1  338   45.9      95.9
K11   1235.7      9163   1253.7         4111   ND      ND    ND      ND    69.0      100.0
K12   1183.6      4775   1201.6         2582   ND      ND    2384.3  336   62.1      93.4
K13   1177.7      7923   1195.8         8583   ND      ND    2372.6  206   47.4      97.5
K14   1278.6      1478   1296.6         4821   ND      ND    ND      ND    23.5      100.0
K15   1158.7      652    1176.7         8848   ND      ND    ND      ND    6.9       100.0
K16   1176.6      5780   1194.6         6023   ND      ND    ND      ND    49.0      100.0
K17   1239.5      33     1257.6         11951  ND      ND    ND      ND    0.3       100.0
K18   1238.8      14887  1256.9         18708  ND      ND    ND      ND    44.3      100.0
K19   1241.7      2483   1259.7         487    ND      ND    ND      ND    83.6      100.0
K20   1238.8      4238   1256.8         2640   1345.9  30    ND      ND    61.3      99.3
K21   1270.6      15965  1288.5         20582  ND      ND    ND      ND    43.7      100.0
K22   1255.6      2415   1273.6         6850   ND      ND    ND      ND    26.1      100.0
K23   1164.6      5588   1182.6         3487   1271.7  321   ND      ND    59.5      94.6
K24   1168.4      244    1186.4         4563   1275.4  137   ND      ND    4.9       64.0
(continued)

Table 2.1. Cyclization Efficiency and Cyclic/Linear Peptide Ratio for 50 Randomly Selected Beads. ND, not detected by MALDI-TOF (threshold = signal/noise >3). Percentage of cyclic peptide (%) was determined with the formula area(M) / [area(M) + area(M+18) + area(M+107) + area(dimer)] x 100. Cyclization yield (%) was determined with the formula area(M) / [area(M) + area(M+107) + area(dimer)] x 100.

Table 2.1. (continued)

Bead  M (cyclic)  Area   M+18 (linear)  Area   M+107   Area  Dimer   Area  %cyclic   Cyclization
      m/z                m/z                   m/z           m/z           peptide   yield (%)
M1    1313.0      10975  1331.1         4171   ND      ND    ND      ND    72.5      100.0
M2    1262.3      8274   1280.3         4930   ND      ND    ND      ND    62.7      100.0
M3    1238.3      1415   1256.3         6573   ND      ND    ND      ND    17.7      100.0
M4    1251.3      5631   1269.3         1081   ND      ND    ND      ND    83.9      100.0
M5    1322.2      11205  1340.2         309    ND      ND    ND      ND    97.3      100.0
M6    1165.4      7606   1183.4         2267   ND      ND    ND      ND    77.0      100.0
M7    1246.4      9284   1264.5         3525   ND      ND    ND      ND    72.5      100.0
M8    1159.4      5923   1177.4         1148   ND      ND    2336.1  60    83.1      99.0
M9    1245.4      4020   1263.4         8551   ND      ND    ND      ND    32.0      100.0
M10   1278.4      2734   1296.4         2548   ND      ND    ND      ND    51.8      100.0
M11   1280.5      8380   1298.5         4996   ND      ND    ND      ND    62.6      100.0
M12   1212.6      533    1230.7         4437   ND      ND    ND      ND    10.7      100.0
M13   1279.6      2402   1297.6         2118   ND      ND    ND      ND    53.1      100.0
M14   1230.5      8249   1248.6         5791   ND      ND    2478.3  91    58.4      98.9
M15   1149.6      5593   1167.6         7063   ND      ND    ND      ND    44.2      100.0
M16   1200.6      3607   1218.6         5622   1307.7  98    ND      ND    38.7      97.4
M17   1179.6      9623   1197.6         1410   ND      ND    ND      ND    87.2      100.0
M18   1205.7      8445   1223.7         3529   ND      ND    ND      ND    70.5      100.0
M19   1237.7      6410   1255.7         38     ND      ND    ND      ND    99.4      100.0
M20   1226.6      214    1244.6         5045   ND      ND    ND      ND    4.1       100.0
M21   1220.8      2442   1238.8         311    ND      ND    2458.9  152   84.1      94.1
M22   1207.7      5724   1225.7         1930   ND      ND    ND      ND    74.8      100.0
M23   1174.5      4065   1192.6         633    1281.5  21    ND      ND    86.1      99.5
M24   1211.3      2904   1229.3         842    ND      ND    ND      ND    77.5      100.0
Average                                                                    49.0      96.2

Second, eight cyclic peptides that had been selected from two different peptide

libraries for binding to streptavidin and porcine α-amylase (vide infra) were individually

synthesized on Rink LS amide resin (0.2 mmol/g) in the absence of encoding linear peptides. The crude products (after deprotection by reagent K) were analyzed by

MALDI-TOF MS. For each of the eight peptides, the desired cyclic peptide was by far

the predominant species in the spectrum (Figure 2.3). No uncyclized peptide (M + 18

peak) was observed in any of the spectra. Peptide dimer formation was observed for four

of the peptides and represents the principal impurities (20-30% abundance relative to the

cyclic monomer product). Third, two of the above peptides, cyclo(AVWmeFRRVQ) and

cyclo(AVWfFRRVQ), and their ~1:1 (mol/mol) mixtures with their respective linear

counterparts were analyzed by HPLC and MALDI-TOF MS (Figures 2.4 and 2.5 and

Table 2.2). In each case, the crude sample contained the desired cyclic peptide as the

major product (84% and 54% purity, respectively, based on HPLC analysis). The relative

ionization efficiency of cyclic vs. linear peptides varied with peptide concentration in the

MALDI samples. With ~12 pmol of peptides (linear and cyclic) spotted in each sample,

ionization ratios of 1.1 and 0.89 (cyclic/linear) were observed for the above peptides,

respectively (Table 2.2). At ~6 pmol peptides, the respective ratios were 0.50 and 0.25.

During our typical analyses of single beads by MALDI MS (Table 2.1), ~7 pmol of

peptide was used in each sample. Taken together, the above results indicate that high

peptide cyclization yields were achieved during the cyclic peptide library synthesis and

the observed variation in the cyclic/linear peptide ratio on different beads was not

primarily due to variation in peptide cyclization efficiency. Rather, the difference in

aqueous/organic phase partitioning during the bead segregation process was likely a

contributing factor; a smaller bead is expected to have a larger fraction of its volume exposed to the organic solvent and therefore a higher percentage of cyclic peptides.

Fluctuation in ionization efficiency during MALDI MS was another contributing factor, which was most likely responsible for the “observed” extremely low (0.3% in K17) and high cyclic peptide percentages (99.4% in M19).
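The bead-size argument can be made concrete with a simple geometric sketch. It assumes, purely for illustration, that the organic solvent penetrates every bead to the same fixed depth, so that the cyclic-peptide layer is a spherical shell of constant thickness; neither the depth nor the bead diameters used below were measured in this work.

```python
def outer_layer_fraction(bead_diameter_um: float, layer_depth_um: float) -> float:
    """Volume fraction of a spherical bead lying within layer_depth_um of its surface."""
    radius = bead_diameter_um / 2.0
    if not 0.0 <= layer_depth_um <= radius:
        raise ValueError("layer depth must lie between 0 and the bead radius")
    inner_core = ((radius - layer_depth_um) / radius) ** 3
    return 1.0 - inner_core


# Hypothetical penetration depth and bead diameters (micrometers).
depth_um = 9.0
for diameter_um in (80.0, 90.0, 100.0):
    fraction = outer_layer_fraction(diameter_um, depth_um)
    print(f"{diameter_um:.0f} um bead: {100 * fraction:.1f}% of volume in outer layer")
```

Under this assumption the outer-layer fraction falls as the bead diameter grows, which is the direction of the trend proposed above.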


Figure 2.3. MALDI-TOF MS Spectra of 8 Cyclic Peptides Individually Synthesized on Bead. Crude peptides synthesized on Rink amide resin were dissolved in DMSO and diluted in 0.01% TFA in water. One μL of the peptide solution was mixed with 2 μL of saturated 4-hydroxy-α-cyanocinnamic acid in acetonitrile/0.1% TFA (1:1) and 1 μL of the mixture was spotted onto a MALDI sample plate. The peptide sequence, the calculated mass for the pseudomolecular ion (M), and the observed m/z values are indicated in each spectrum. Nle, norleucine; fF, p-fluorophenylalanine; meF, Nα-methylphenylalanine; F, D-phenylalanine.

[Figure 2.3 panels, x-axis m/z 900-2100 in each spectrum:]

cyclo(GTHPQALE)BK: Calcd M = 1032.56; Found m/z 1032.0; M + Na 1054.0; Dimer 2063.0
cyclo(GWYHNALE)BK: Calcd M = 1169.57; Found m/z 1169.2; M + Na 1191.1
cyclo(GRFHHALQ): Calcd M = 947.50; Found m/z 947.13; Dimer 1894.3
cyclo(GHNleHRALQ): Calcd M = 913.51; Found m/z 913.19
cyclo(GPRHYALQ): Calcd M = 923.48; Found m/z 923.25; Dimer 1845.6
cyclo(AWfFYFKVQ): Calcd M = 1088.54; Found m/z 1088.63; M + Na 1110.6
cyclo(AVWmeFRRVQ): Calcd M = 1057.61; Found m/z 1057.60
cyclo(AVWfFRRVQ): Calcd M = 1061.58; Found m/z 1061.52
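The calculated masses in Figure 2.3 and the M + 18 and M + 107 offsets used in Section 2.2.2 can be reproduced from monoisotopic residue masses. The following Python sketch is an illustration only: the function names are hypothetical, the mass table covers just the residues needed for the two examples, and the residue written as Q (the former glutamate attachment point) is treated as a glutamine residue.

```python
# Monoisotopic residue masses in Da (standard values, partial table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276, "L": 113.08406,
    "Q": 128.05858, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333,
}
PROTON = 1.00728
WATER = 18.01056
BENZYLAMINE = 107.07350  # C7H9N


def cyclic_mh(sequence: str) -> float:
    """[M + H]+ of a head-to-tail cyclic peptide: residues only, no terminal water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + PROTON


def linear_mh(sequence: str) -> float:
    """[M + H]+ of the uncyclized (linear) peptide, i.e. the M + 18 peak."""
    return cyclic_mh(sequence) + WATER


def benzylamide_mh(sequence: str) -> float:
    """Linear peptide capped with benzylamine (acid + BnNH2 - H2O), i.e. M + 107."""
    return cyclic_mh(sequence) + BENZYLAMINE


print(round(cyclic_mh("GRFHHALQ"), 2))   # 947.5  (Figure 2.3: Calcd M = 947.50)
print(round(cyclic_mh("GPRHYALQ"), 2))   # 923.48 (Figure 2.3: Calcd M = 923.48)
```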

Figure 2.4. HPLC and MS Analysis of Linear and Cyclic Peptide AVWmeFRRVQ.

(a) HPLC chromatogram of crude linear peptide. The desired product had a tR of 14.3 min. The impurities at tR = 19.6 and 21.0 min were peptides containing incompletely deprotected arginine side chains. (b) HPLC chromatogram of crude cyclo(AVWmeFRRVQ). (c) HPLC chromatogram of 1.33:1 mixture of cyclic/linear peptides. (d) MALDI-TOF MS of the peptide mixture from (c).

HPLC condition: Analytical HPLC was run on C18 column (Vydac 218TP54; 4.6 mm i.d. x 250 mm length) with a gradient formed from A: 0.05% TFA in water, B: 0.05 % TFA in acetonitrile using UV detection (280 nm) by Waters 486 tunable absorbance detector. The crude peptides were dissolved in DMSO to the concentration of 10 mg/mL and diluted in HPLC solvent (80% A). Each injection contained 200 μL sample (3% DMSO) and the column was eluted with a gradient of 20% to 80% B in 30 min at 1 mL/min.

[Figure 2.4 panels: (a) linear peptide, Calcd M = 1075.62, observed m/z 1075.2, with an impurity at m/z 1327.6 (M + Pbf); (b) cyclic peptide, Calcd M = 1057.61, observed m/z 1057.2; (c) cyclic + linear mixture, molar ratio (from HPLC peak area) = area cyclic (m/z 1057.2) / area linear (m/z 1075.2) = 1.33; (d) MALDI-TOF MS of the mixture, with peaks at m/z 1057.8 (cyclic) and 1075.8 (linear). Chromatogram axes: absorbance at 280 nm and % B versus time (0-35 min).]

Figure 2.5. HPLC and MS Analysis of Linear and Cyclic Peptide AVWfFRRVQ.

(a) HPLC chromatogram of crude linear peptide. The desired product had a tR of 13.8 min. The impurity at tR = 21.9 min was the peptide containing one incompletely deprotected arginine side chain. (b) HPLC chromatogram of crude cyclo(AVWfFRRVQ). (c) HPLC chromatogram of 0.97:1 mixture of cyclic/linear peptides. (d) MALDI-TOF MS of the peptide mixture from (c). HPLC conditions were the same as in Figure 2.4.

[Figure 2.5 panels: (a) linear peptide, Calcd M = 1079.60; (b) cyclic peptide, Calcd M = 1061.58; (c) cyclic + linear mixture, molar ratio (from HPLC peak area) = area cyclic (m/z 1061.1) / area linear (m/z 1079.4) = 0.97; (d) MALDI-TOF MS of the mixture, with peaks at m/z 1061.4 (cyclic) and 1079.4 (linear). Peaks labeled in the chromatograms: m/z 1079.4 (linear), m/z 1331.6 (M + Pbf), m/z 1061.1 (cyclic), m/z 1061.1 (putative epimer), and m/z 2122.6 (dimer). Chromatogram axes: absorbance at 280 nm and % B versus time (0-35 min).]

Table 2.2. Comparison of the Ionization Efficiencies of Linear and Cyclic Peptides in MALDI MS.

The 1.33:1 (mol/mol) mixture of cyclo(AVWmeFRRVQ) and the corresponding linear peptide (from Figure 2.4) was dissolved in 0.1% TFA solution to make 36 μM and 18 μM solutions. Each solution was mixed with two volumes of saturated 4-hydroxy-α-cyanocinnamic acid in acetonitrile/0.1% TFA and spotted 10 times onto a MALDI sample plate (1 μL each spot). The samples were analyzed in a Bruker Reflex III MALDI-TOF instrument in an automated format. The ratios of cyclic/linear peptide ionization efficiencies were calculated from the MALDI peak areas. The “average” ratio was the mean from 10 parallel experiments. An “average*” value was also calculated after excluding the highest and lowest values and corrected by a factor of 1.33 (cyclic/linear peptide molar ratio). The same experiments were repeated for cyclo(AVWfFRRVQ) and the linear counterpart (from Figure 2.5).
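The “average*” calculation described above can be sketched as follows. This is an interpretation rather than the original analysis script: it assumes that the per-spot ratio is the cyclic peak area divided by the linear peak area, and that the correction divides the trimmed mean by the cyclic/linear molar ratio. The peak areas in the example are hypothetical placeholders, not data from Table 2.2.

```python
def trimmed_mean(values):
    """Mean after discarding the single highest and single lowest value."""
    if len(values) < 3:
        raise ValueError("need at least three values to drop the extremes")
    kept = sorted(values)[1:-1]
    return sum(kept) / len(kept)


def average_star(peak_areas, molar_ratio):
    """Trimmed mean of cyclic/linear area ratios, corrected for the molar ratio.

    peak_areas: list of (cyclic_area, linear_area) pairs, one per MALDI spot.
    molar_ratio: cyclic/linear molar ratio of the mixture (e.g. 1.33 from HPLC).
    """
    ratios = [cyclic / linear for cyclic, linear in peak_areas]
    return trimmed_mean(ratios) / molar_ratio


# Hypothetical peak areas for 10 replicate spots (placeholders only).
spots = [(1500, 1000), (1400, 1100), (1650, 1200), (900, 1000), (1300, 950),
         (1550, 1050), (2100, 1000), (1450, 1150), (1350, 1000), (1600, 1250)]
print(round(average_star(spots, molar_ratio=1.33), 2))
```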


2.2.3 Sequence Determination of Cyclic Peptides by PED/MS

Forty beads were randomly selected from the cyclic peptide library I, placed in two separate reaction vessels (20 beads each), and subjected to seven cycles of PED [5, 6], which converted the linear encoding peptide on each bead into a series of progressively shorter peptides (Figure 2.6a). The beads were then separated into individual microcentrifuge tubes and the peptides were released by cleavage with CNBr and analyzed by MALDI-TOF MS. A total of 37 beads (92%) produced spectra of sufficiently high quality to allow their unambiguous sequence assignment (Table 2.3).

The MS spectra of the other 3 beads missed one or more peaks, preventing reliable sequence assignment. Figure 2.6b shows a representative MS spectrum derived from a single positive bead selected for binding against streptavidin (vide infra). The cyclic peptide produced an intense peak at m/z 1214.6, whereas the full-length linear peptide bearing an N-terminal nicotinoyl group generated a peak at m/z 1337.7. The truncated peptides gave a series of peaks at m/z 1280.6, 1179.6, 1042.5, 945.49, 817.42, and 746.40.

From the masses of this peptide ladder, the sequence of the cyclic peptide was determined as cyclo(GTHPQALE)BBRM [5, 6]. We have applied this method to sequence hundreds of cyclic peptides from library II, which contained both naturally occurring and unnatural amino acids, and the success rate has typically been ~90%.
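Reading a sequence from the PED ladder amounts to matching the mass difference between consecutive peaks to a residue mass. The Python sketch below illustrates the idea with the seven ladder peaks quoted above. It is not the software used in this work: the residue table is partial, the function name and the 0.1 Da tolerance are assumptions, and residues that cannot be separated at this tolerance (here Gln and Lys) are simply reported together.

```python
# Monoisotopic residue masses in Da (standard values, partial table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L/I/Nle": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931,
}


def read_ladder(peaks, tolerance=0.1):
    """For each ladder step, list residues matching the peak-to-peak mass
    difference within the tolerance, best match first (N- to C-terminal order)."""
    ordered = sorted(peaks, reverse=True)
    calls = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        hits = sorted(
            (abs(delta - mass), name)
            for name, mass in RESIDUE_MASS.items()
            if abs(delta - mass) <= tolerance
        )
        calls.append([name for _, name in hits])
    return calls


# Ladder peaks (m/z) reported for the streptavidin-binding bead.
ladder = [1337.7, 1280.6, 1179.6, 1042.5, 945.49, 817.42, 746.40]
for step, residues in enumerate(read_ladder(ladder), start=1):
    print(step, residues)
```

With these peaks the six steps return G, T, H, P, Q (with K as the alternative), and A, in agreement with the assignment GTHPQA for the residues preceding the fixed C-terminal portion.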

[Figure 2.6 panels: (a) Scheme: a bead carrying the surface cyclic peptide and the interior linear peptide H2N-GXXXXALEBBRM is treated with 1) 5:1 PITC/Nic-OSU and 2) TFA; the cycle is repeated 6x, and CNBr cleavage followed by MS gives the ladder Nic-GXXXXALEBBRM*, Nic-XXXXALEBBRM*, Nic-XXXALEBBRM*, Nic-XXALEBBRM*, Nic-XALEBBRM*, Nic-ALEBBRM*, and Nic-LEBBRM*. (b) Mass spectrum (m/z 700-1400) for peptide GTHPQALEBBRM: cyclic peptide at m/z 1214.6; ladder peaks at m/z 1337.7, 1280.6, 1179.6, 1042.5, 945.49, 817.42, and 746.40, with successive differences assigned to G, T, H, P, Q, and A.]

Figure 2.6. Partial Edman Degradation of Cyclic Peptide Library. (a) Partial Edman degradation of resin bound peptide. PITC, phenylisothiocyanate; Nic- OSU, N-hydroxysuccinimidyl nicotinate; M*, homoserine lactone. (b) MALDI-TOF mass spectrum of the peptide and its degradation products from a colored bead.


Trial No.   Selected against SA-AP?   No. of beads analyzed by PED/MS   No. of complete sequences obtained
1           no                        20                                17 (85%)
2           no                        20                                20 (100%)
3           yes                       11                                7 (64%)
4           yes                       11                                10 (91%)
Total (average)                       62                                54 (87%)

Table 2.3. Success Rate for Sequencing Resin-Bound Cyclic Peptides by PED/MS.

2.2.4 On-Bead Screening for Streptavidin Binding Ligands

As a proof of principle, an aliquot of the library I (50 mg, ~143,000 beads) was

screened for binding to streptavidin, which had been conjugated to alkaline phosphatase

(SA-AP). Since the protein conjugate (~200 kDa) is too large to diffuse into the TentaGel

beads [132], only cyclic peptides on the bead surface should interact with the protein (no

interference from linear peptides). Binding of streptavidin to a bead recruited alkaline phosphatase to the bead surface, and subsequent incubation in the presence of BCIP produced an intense turquoise color on the bead [1]. Screening of the octapeptide library against 5 nM streptavidin resulted in 22 colored beads. PED/MS analysis of the beads gave 17 complete sequences (77%). The lower success rate for the streptavidin-binding sequences was due to their high content of proline and tryptophan residues, which produced more complex MS spectra [5, 6]. The MS spectrum derived from one of the colored beads is shown in Figure 2.6b, from which a sequence of cyclo(GTHPQALE)BBRM was obtained. Inspection of the 17 sequences revealed that

streptavidin recognized two different consensus sequences, HP(Q/Y) and WYX (Table

2.4). Two of the peptides, (GTHPQALE)BK and (GWYHNALE)BK, as well as their

linear counterparts were individually synthesized and tested for binding to SA-AP. In an

SA-AP pull-down assay, TentaGel derivatized with cyclic peptide (GTHPQALE)BK

retained significantly more alkaline phosphatase activity than the control resin

(underivatized TentaGel), confirming the ability of the cyclic peptide to bind SA-AP

(Figure 2.7a). Our attempts to determine the dissociation constant by surface plasmon

resonance failed, due to weak binding affinity (KD >10 μM). SA-AP pull-down assays in

the presence of the cyclic peptides as competitors gave IC50 values of ~100 and ~700 μM

for peptides cyclo(GTHPQALE)BK and cyclo(GWYHNALE)BK, respectively (Figure

2.7b). The corresponding linear peptides (GTHPQALEBK and GWYHNALEBK) were

less effective competitors (IC50 >1 mM). Other investigators have previously reported tripeptide HPQ as a specific streptavidin-binding ligand.

Bead #   Peptide sequence   Bead #   Peptide sequence
1        GCHPQALE           11       GWYCIALE
2        GHPQCALE           12       GWYCLALE
3        GHPQYALE           13       GWYHIALE
4        GIHPQALE           14       GWYHNALE
5        GMHPQALE           15       GWYQLALE
6        GTHPQALE           16       GWYTHALE
7        GWHPQALE           17       GYYYKALE
8        GNHPYALE
9        GRHPYALE
10       GYHPYALE

Table 2.4. Cyclic Peptides Selected against Streptavidin.

C, (S)-2-aminobutyric acid.

[Figure 2.7 panels: (a) bar graph of A405 (0.00-0.15) for samples 1 and 2; (b) A405 (0-0.06) versus [competitor peptide] (0.0-1.0 mM) for cyclo(GTHPQALE)BK, GTHPQALEBK, cyclo(GWYHNALE)BK, and GWYHNALEBK.]

Figure 2.7. Binding of SA-AP to Immobilized Peptide Cyclo(GTHPQALE). (a) Amount of SA-AP retained by underivatized TentaGel S NH2 resin (1) and the affinity resin containing cyclo(GTHPQALE) (2); Data shown are the mean ± SD from six sets of experiments. (b) Binding of SA-AP in the presence of increasing concentration of free cyclic and linear peptides.

2.2.5 On-Bead Screening for Porcine α-Amylase Inhibitors

We chose porcine α-amylase as our second target for the cyclic peptide library.

Initially, an aliquot of the library I (40 mg) was screened against biotin-labeled porcine

α-amylase, in the same manner as for the streptavidin-binding ligands. Most of the selected peptides resembled the streptavidin-binding sequences, suggesting that there may not be a strong ligand for α-amylase. Screening of library II against biotin-labeled porcine α-amylase revealed a different type of sequence (Table 2.5). Nevertheless, the screening was not reproducible, and the selected peptides did not inhibit the enzyme.

We therefore decided to screen library II against α-amylase labeled with a fluorescent tag such as fluorescein or Texas Red. In this way we could avoid bead-washing steps during the screening and thus not disturb the binding equilibrium between the beads and the protein. In addition, we screened a large number of beads (~400 mg or 1.1 million beads at a time) because the diversity of the library is large. Twenty-four beads were selected after repeated screenings using α-amylase labeled with Texas Red or fluorescein.

The selected beads were subjected to PED and 15 sequences were recovered. Even though only 15 sequences were obtained from 1.1 million beads (~0.001%), they showed a strong consensus: a dipeptide Trp-Mpa or Trp-Fpa followed by a positively charged residue, Arg or Lys (Table 2.6). The low sequencing success rate (~63%) was possibly caused by the presence of a Trp residue next to the Mpa/Fpa residue. Four peptides, cyclo(AWfFYFKVQ), cyclo(AWmeFYFKVQ), cyclo(AVWfFRRVQ), and cyclo(AVWmeFRRVQ), were synthesized based on the sequence information. One of the peptides, cyclo(AWfFYFKVQ), inhibited α-amylase with an IC50 value of 5 μM, whereas the other peptides did not show significant inhibition.

[Figure 2.8 plot: A540 nm (0-0.2) versus [cyclo(AWfFYFKVE)NH2] (0-20 μM).]

Figure 2.8. Inhibition of α-Amylase by Cyclic Peptide Cyclo(AWfFYFKVE)NH2.


Trial #1:
A f meF fF n e V E
A f meF H T D V E
A f meF meF Y n V E
A f meF Sar v e V E
A f meF Y Q T V E
A Phg Nle f meF Sar V E
A S Phg f meF D V E
A Q f meF f e V E
A e Q f meF G V E
A Q f meF I v V E
A v Q f meF meF V E
A Q f meF P G V E
A e Q f meF Sar V E
A T Q f meF Sar V E
A e R Q f meF V E
A l Sar Q f meF V E
A e Ser f meF Y V E
A l Y f meF D V E

Trial #2:
A fF T Orn fF H V E
A H Phg R meF l V E
A H R H meA meL V E
A H R meA H Sar V E
A H R T H f V E
A I K H R Y V E
A K Y l H R V E
A meF R l H n V E
A n R meF f H V E
A Nle R H P Q V E
A Nle Sar f Q n V E
A Orn f H f R V E
A Phg Y H R fF V E
A Q T R f H V E
A R l F meL H V E
A R fF P Y H V E
A R H G fF H V E
A R H W Y l V E
A Sar W H Sar R V E
A Y H R G n V E

Table 2.5. Selected Sequences from Library II against Biotin-Labeled α-Amylase. In trial #1, biotin-labeled α-amylase premixed with SA-AP was incubated with an aliquot of resin for the screening (part of the sequences); in trial #2, biotin-labeled α-amylase was first incubated with the resin, then SA-AP was added for staining. e, f, l, n, v: D-amino acids; me: Nα-methylated amino acids; fF: 4-fluorophenylalanine; Nle: norleucine; Orn: ornithine; Sar: sarcosine; Phg: phenylglycine.


Peptide sequences, cyclo(A X X X X X V E):
A W fF# Y f K V E
A W meF G Y Y V E
A v W meF# R R V E
A W meF Y H Q V E
A W meF P T Q V E
A l Asn H Phg Q V E
A W fF T R I V E
A f Nle Y K Q V E
A W meF R K fF V E
A P H I K Nle V E
A W meF meA R H V E
A v Y T Y H V E
A G f Orn f S V E
A W meF Phg H D V E
A W meF I Y N V E

Table 2.6. Selected Sequences against Fluorescence-Labeled α-Amylase. l, n, v: D-amino acids; me: Nα-methylated amino acids; fF: 4-fluorophenylalanine; Nle: norleucine; Orn: ornithine; Phg: phenylglycine; sequences in bold, individually synthesized for the inhibition against α-amylase; # as the mass spectra were unclear due to the oxidation of the Trp residue, both peptides containing fF and meF were synthesized separately.

2.3 Discussion

We have developed a general method for the rapid sequencing of cyclic peptides

derived from combinatorial libraries. The method introduced here is compatible with on-

bead screening. Compared to some of the other encoding methods [133], our method has the advantage of being effectively a direct method, since the cyclic and linear peptides

always have the same sequence and MS analysis provides information on both the

identity and quantity (semi-quantitatively) of a cyclic peptide on each bead. Our method

has a typical success rate of ~90% (defined as unambiguous sequence assignment at all

positions), and does not generate any incorrect assignments other than those caused by

human error. The incomplete sequences due to missing peaks do not compromise the

reliability of the assigned sequences. Our method worked very well with both microbeads and macrobeads, the latter in libraries for solution-phase screening (vide infra). Another advantage of our method over conventional MS/MS methods is its ability to differentiate amino acids of degenerate masses [5, 6]. This feature was useful for screening cyclic peptide libraries containing L-, D-, and Nα-methylated amino acids, many of which were

degenerate in mass (e.g., Ala, Sar; Ile, Nle, Leu; Orn, Asn; Gln, Lys). Most importantly,

our method has a very high throughput capability. We can routinely sequence >100

cyclic peptides in a single day and at an average cost (reagents and instrument time) of

<$1 per peptide. Our method does not require a dedicated MS instrument and the MS

analysis can be performed on any MALDI-TOF instrument of sufficient sensitivity and in

an automated format. Therefore, it can be readily practiced in any chemical or biochemical laboratory. Our method in its current format is limited to the sequence determination of cyclic peptides from synthetic libraries and does not work with cyclic

peptides isolated from natural sources.
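The mass degeneracy mentioned above can be illustrated with a short sketch. It groups the residues named in the text by monoisotopic residue mass; the 0.1 Da grouping tolerance is an illustrative assumption and not an instrument specification from this work.

```python
# Monoisotopic residue masses (Da) for the residues named in the text.
RESIDUE_MASS = {
    "Ala": 71.03711, "Sar": 71.03711,
    "Ile": 113.08406, "Leu": 113.08406, "Nle": 113.08406,
    "Asn": 114.04293, "Orn": 114.07931,
    "Gln": 128.05858, "Lys": 128.09496,
}

def degenerate_groups(masses, tolerance=0.1):
    """Group residues whose masses lie within `tolerance` Da of a neighbor.

    The tolerance is a hypothetical value chosen for illustration only.
    """
    groups = []
    for name, mass in sorted(masses.items(), key=lambda kv: (kv[1], kv[0])):
        if groups and mass - groups[-1][-1][1] < tolerance:
            groups[-1].append((name, mass))
        else:
            groups.append([(name, mass)])
    # Keep only groups with more than one member (the degenerate sets).
    return [[name for name, _ in group] for group in groups if len(group) > 1]

groups = degenerate_groups(RESIDUE_MASS)
for group in groups:
    print(" / ".join(group))
```

Ala/Sar and Ile/Leu/Nle are exactly isobaric, whereas Asn/Orn and Gln/Lys differ by about 0.036 Da, which is why an additional marker is needed to tell them apart in a routine ladder spectrum.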

One of the benefits of synthetic cyclic peptide libraries is that many nonproteinogenic amino acids can be included during the synthesis. Library II clearly showed that such nonproteinogenic amino acids can be incorporated and that their sequence identification is efficient. Furthermore, the cyclic peptides selected from the screening indeed contained several nonproteinogenic amino acids, demonstrating the power of synthetic cyclic peptide libraries.

On-bead screening of the cyclooctapeptide library I identified two types of sequence motifs that bind to streptavidin, HP(Q/Y) and WYX. The HPQ motif has been repeatedly selected against streptavidin in other studies [1, 56, 134, 135]. Our cyclic peptides bind to streptavidin with high-micromolar KD values, similar to those reported

for linear HPQ sequences [1, 134, 135]. Others have reported that cyclization of HPQ

sequences resulted in much higher binding affinities (KD in the nM range) [56, 135]. The

discrepancy between earlier reports and our results suggests that the nature of cyclization

(e.g., ring size) can dramatically affect the binding affinity of a cyclic peptide ligand,

which can be either higher or lower than that of the corresponding linear peptide.

On-bead screening of both libraries I and II against the enzyme α-amylase revealed several interesting features. In our initial attempts to screen for inhibitors of the enzyme, we used biotin-labeled α-amylase. Interestingly, many HPQ-containing peptides were selected from library I, probably through binding of the beads to SA-AP rather than to the biotin-labeled α-amylase. The peptide sequences selected from library II were more puzzling. They were quite different from the HPQ sequences known to bind SA-AP, and showed a very strong consensus each time. Although we have not confirmed the binding of these peptides to streptavidin, they most likely bind to streptavidin during the screening process. It is interesting that HPQ sequences were not selected from library II, even though such sequences certainly exist in the library. Perhaps the new sequences have a higher binding affinity than the HPQ-containing sequences.

These sequences are nonetheless useful, as sequences from other screenings can be compared against them for validation purposes. Although screening with biotin-labeled protein has not been successful for the cyclic peptide libraries, probably because SA-AP interferes with the screening, this does not negate the utility of the method, as many such screenings have been performed successfully in our laboratory.

In contrast to the screening of beads with biotin-labeled proteins, the screening with fluorescence-labeled enzyme yielded useful information. In a large-scale screening of cyclic peptide library II, we were able to select cyclic peptides that bind to α-amylase. Although only 15 peptide sequences were obtained from the screening, there was a strong consensus among them. Indeed, one of the peptides individually synthesized from those sequences showed inhibitory activity toward the enzyme. Initially, our attempt to develop a cyclic peptide inhibitor against α-amylase was

prompted by a recent report by Doleckova-Maresova and co-workers [136]. They

synthesized a linear octapeptide inhibitor of α-amylase by de novo design. First, they

screened 400 sublibraries with 2 fixed amino acids at the N-terminus, and searched for

the inhibition against porcine pancreatic α-amylase. Starting from these 2 fixed residues

in an octapeptide library, they developed inhibitors by restricting one residue at a time until they found one peptide sequence with a potency surpassing that of the natural oligosaccharide acarbose. While their method relied on iterative deconvolution of soluble peptide libraries using a solution-phase assay, we relied on on-bead screening of a large number of cyclic peptide library members. As seen with acarbose [137], it is reasonable to expect that inhibitors of α-amylase have binding affinity toward the enzyme. Without relying on solution-phase screening, we could select a peptide that inhibits the enzyme, taking advantage of the high-throughput screening of cyclic peptides in a short period of time. Several challenges remain, though. Unlike solution-phase screening, which directly measures the inhibitory activity of library members, on-bead screening selects binding beads that are not necessarily inhibitors.

The selection of the cyclic peptide cyclo(AWfFYFKVQ) encouraged us to develop second- and third-generation libraries (Figure 2.9) based on sequence information obtained from our screening results and on other documented inhibitors of α-amylase.

Those attempts, unfortunately, failed to improve on the original peptide. As described above, only one peptide showed inhibition, while the other individually synthesized peptides did not show significant inhibition. It should be noted, however, that even with on-bead screening, which relies on binding and not necessarily on inhibition, we could select an inhibitor of α-amylase. As our method allows both on-bead screening and solution-phase

screening, it is possible that solution-phase screening would help improve the “hits” from

the initial on-bead screening.

Another remaining challenge in on-bead screening is how to ensure that specific ligands are selected. As mentioned above, screening against biotin-labeled proteins often selected peptides through nonspecific binding, either to SA-AP or of unknown origin. This was also true in some cases when fluorescence-labeled protein was used. Notably, many positively charged residues, Arg and Lys, are often seen in the selected peptides when no clear consensus is observed. This might simply be caused by the lack of strong binding ligands in the library. However, altering the screening method has, in some cases, solved the problem, indicating that the cause was not the absence of good ligands in the library. There are a few examples where the inclusion of known inhibitors of the target enzyme helped find selective inhibitors by on-bead screening [68].

In parallel with the cyclic peptide libraries for on-bead screening, cyclic peptide libraries for solution-phase screening were developed in our laboratory. The power of the method was demonstrated in the development of tyrocidine analogs with improved antibacterial activity and lower hemolytic effect, as reported in 2006 [138] and 2007 [139].

Utilizing TentaGel macrobeads (280-320 μm, ~4 nmol/bead), cyclic peptide libraries of

1716 different compounds were synthesized, and the peptides released from the resin

were used for screening. The diversity of these libraries is several orders of magnitude lower than that of the libraries for on-bead screening, because each bead must be assayed individually. Nevertheless, this method is much more powerful than the parallel synthesis method; only 192 peptides could be tested by another group using the parallel synthesis approach to improve the same compound [27].

Our results so far suggest that combinatorial synthesis and screening can be an effective method to find ligands for biological targets, to discover enzyme inhibitors, and to improve the biological activity of natural cyclic peptides.

2.4 Conclusion

In conclusion, we have developed an effective method for sequence determination of

library-derived cyclic peptides. This method applies to both on-bead screening and

solution-phase screening of synthetic cyclic peptides for biological activities. We have demonstrated our method in the search for ligands of streptavidin and for an enzyme inhibitor of α-amylase. Development of cyclic peptide ligands for several biological targets is already under way, and this should further expand the utility of cyclic peptides in biomedical research and drug discovery.

[Figure 2.9: the structure drawings are not recoverable from the text extraction; the recoverable annotations are listed below.]

(a) Cyclo(X X X W X X X Q E)-BBRM-bead; X: 26 amino acids. Residue sets shown at restricted positions: Y, R, T, G, P; Y, R, meF, fF; Y, D-F, R, E. Diversity for the four ring sizes: 26 × 5 × 4 = 520; × 26 = 13,520; × 26 = 351,520; × 4 = 1,406,080.

(b) 15% split for the original sequence; inclusion of the stereoisomer at each position; 2 g resin; 5 random positions.
Identical: 0.15⁵ = 0.008% (~450 copies)
1 point-mutant: 5 × 0.15⁴ × 0.85 = 0.2% (~100 copies)
2 point-mutant: 10 × 0.15³ × 0.85² = 2.4% (~24 copies)
3 point-mutant: 10 × 0.15² × 0.85³ = 13.8% (~5 copies)
4 point-mutant: 5 × 0.15 × 0.85⁴ = 39.2% (~1 copy)
5 point-mutant: 0.85⁵ = 44.4%

Figure 2.9. Modified Cyclic Peptide Libraries for α-Amylase Inhibitors. (a) Cyclic peptide library with different ring sizes. (b) A biased library based on the peptide sequence cyclo(AWfFYFKVQ). Underlined residues were randomized.
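The numbers annotated in Figure 2.9 follow from simple counting, which the sketch below reproduces. It assumes, as stated in the figure, five randomized positions with a 15% chance of retaining the parent residue at each position (a binomial model), and it evaluates the running product shown for panel (a). The function and variable names are illustrative only.

```python
from math import comb

def mutant_distribution(n_positions=5, p_original=0.15):
    """Fraction of beads with k substitutions relative to the parent sequence."""
    q = 1.0 - p_original
    return {
        k: comb(n_positions, k) * p_original ** (n_positions - k) * q ** k
        for k in range(n_positions + 1)
    }

dist = mutant_distribution()
for k, fraction in dist.items():
    print(f"{k} substitutions: {100 * fraction:.3f}%")

# Panel (a): running product of the diversity factors shown in the figure.
ring_diversity = [26 * 5 * 4]
for factor in (26, 26, 4):
    ring_diversity.append(ring_diversity[-1] * factor)
print(ring_diversity)
```

Because the six classes cover every possible outcome, their fractions must add up to 100%, which is a convenient check on the individual percentages.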


[Figure 2.10: the structure drawing is not recoverable from the text extraction. Recoverable labels: encoding sequence M-K-dFFPdFLXXYQ-NH2; P (90%), R (10%); ~4 nmol/bead.]

Figure 2.10. Design of Cyclic Peptide Library for Tyrocidine A Analogs. The cyclic peptide is synthesized on an ester linker that can be cleaved by a base. The encoding sequence stays on the resin until CNBr cleavage after PED. Base cleavage releases the cyclic peptide for solution-phase screening while the encoding tag is still attached to the bead. X: randomized residues with 20 amino acids (diversity 20 x 20 = 400). Contribution from Dr. Qing Xiao.

2.5 Experimental Sections

2.5.1 Materials

Fmoc-protected L-amino acids were purchased from Advanced Chemtech

(Louisville, KY), Peptides International (Louisville, KY), or NovaBiochem (La Jolla,

CA). Nα-Fmoc-L-glutamic acid α-allyl ester (Fmoc-Glu-OAll) was from NovaBiochem.

O-Benzotriazole-N,N,N’,N’-tetramethyluronium hexafluorophosphate (HBTU), 1-

hydroxybenzotriazole hydrate (HOBt), and Fmoc-mini-PEG were from Peptides

International. Benzotriazole-1-yloxy-tris-pyrrolidino-phosphonium hexafluorophosphate

(PyBOP) was from NovaBiochem. All solvents and other chemical reagents were

obtained from Aldrich (Milwaukee, WI), Fisher Scientific (Pittsburgh, PA), or VWR

(West Chester, PA) and were used without further purification, unless noted otherwise.

N-Hydroxysuccinimidyl nicotinate (Nic-OSU) was from Advanced ChemTech and was

recrystallized from ethyl acetate prior to use. Phenyl isothiocyanate (PITC) was

purchased in 1-mL sealed ampoules from Sigma-Aldrich and a freshly opened ampoule

was used in each experiment. Streptavidin-alkaline phosphatase (SA-AP) conjugate (~1 mg/mL) was purchased from Prozyme (San Leandro, CA). TentaGel S NH2 resin (90 μm,

0.26 mmol/g, and ~100 pmol/bead) was purchased from Peptides International. Rink

Resin LS (100-200 mesh, 0.2 meq/g) was purchased from Advanced ChemTech. 5-

Bromo-4-chloro-3-indolyl phosphate (BCIP) disodium salt was from Sigma (St. Louis,

MO). p-Nitrophenyl phosphate sodium salt was from Research Organics (Cleveland,

OH). An Olympus SZX12 research stereo microscope (Olympus America, Center Valley, PA) equipped with a fluorescence illuminator was used for screening fluorescent beads.

2.5.2 Synthesis of Nα-Fmoc-Glu(δ-N-hydroxysuccinimidyl)-O-CH2CH=CH2

NHS ester of Fmoc-Glu (OAll) was synthesized by Dr. Bhaskar Gopishetty. Fmoc-

Glu-OAll (1.0 g, 2.4 mmol) and N-hydroxysuccinimide (0.34 g, 2.9 mmol) were

dissolved in 40 mL of dichloromethane (DCM) and the mixture was stirred vigorously

for 30 min at room temperature to dissolve most of the N-hydroxysuccinimide. Then,

0.55 g of dicyclohexylcarbodiimide (2.7 mmol) was added and the reaction was allowed

to proceed at room temperature overnight. The mixture was filtered to remove the white

precipitate (N,N’-dicyclohexylurea), and the filtrate was washed with saturated NaHCO3 solution and brine and dried over MgSO4. The solvent was removed under reduced pressure and the crude product was purified by column chromatography (80% ethyl acetate in hexane) to afford a white solid (0.85 g, 69%). ¹H NMR (250 MHz, CDCl3) δ 2.05-2.21 (m, 1H), 2.26-2.44 (m, 1H), 2.60-2.75 (m, 2H), 2.78 (br s, 4H), 4.21 (t, J = 6.8 Hz, 1H), 4.32-4.53 (m, 3H), 4.64 (d, J = 5.7 Hz, 2H), 5.21-5.39 (m, 2H), 5.62 (d, J = 8.2 Hz, 1H), 5.80-5.99 (m, 1H), 7.27-7.44 (m, 4H), 7.60 (d, J = 7.6 Hz, 2H), 7.76 (d, J = 7.6 Hz, 2H); ¹³C NMR (63 MHz, CDCl3) δ 25.5, 27.3, 47.1, 53.0, 66.3, 67.0, 119.3, 119.9, 125.0, 127.0, 127.7, 131.1, 141.2, 143.6, 167.8, 169.0, 170.8; HRESI-MS: C27H26N2O8Na+ ([M + Na]+), calcd 529.1581, found 529.1566.
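As a consistency check on the quantities reported for this preparation, the sketch below recomputes the yield and the calculated [M + Na]+ value. The molecular formula of the starting material (C23H23NO6 for Fmoc-Glu-OAll) is an assumption derived from its structure; the product formula is the one given in the text. Atomic masses are standard values rounded for illustration.

```python
# Average atomic masses (g/mol) for formula weights; monoisotopic masses (Da)
# for the high-resolution m/z calculation.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "Na": 22.9897693}
ELECTRON_MASS = 0.00054858

def formula_mass(formula, table):
    """Sum atomic masses over an element -> count mapping."""
    return sum(table[element] * count for element, count in formula.items())

fmoc_glu_oall = {"C": 23, "H": 23, "N": 1, "O": 6}   # assumed formula of starting material
nhs_ester = {"C": 27, "H": 26, "N": 2, "O": 8}       # product formula from the text

mmol_start = 1.0 / formula_mass(fmoc_glu_oall, AVERAGE) * 1000    # 1.0 g used
mmol_product = 0.85 / formula_mass(nhs_ester, AVERAGE) * 1000     # 0.85 g isolated
yield_percent = 100 * mmol_product / mmol_start

# Sodium adduct: add Na to the neutral formula and remove one electron.
mz_calc = formula_mass(dict(nhs_ester, Na=1), MONOISOTOPIC) - ELECTRON_MASS
ppm_error = (529.1566 - mz_calc) / mz_calc * 1e6

print(f"start {mmol_start:.2f} mmol, product {mmol_product:.2f} mmol, yield {yield_percent:.0f}%")
print(f"[M+Na]+ calcd {mz_calc:.4f}, deviation of found value {ppm_error:.1f} ppm")
```

Under these assumptions the computed yield and calculated m/z agree with the reported 69% and 529.1581, and the measured value deviates by roughly 3 ppm.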

2.5.3 Synthesis of Cyclic Peptide Libraries

The cyclic peptide library I, cyclo(GXXXXALE) was synthesized on 2.0 g of

TentaGel S NH2 resin (90 μm, 0.26 mmol/g). All of the manipulations were performed at

room temperature unless otherwise noted. The linker sequence (BBRM) was synthesized

with 4 equiv of Fmoc-amino acids, using HBTU/HOBt/N-methylmorpholine (NMM) as

the coupling reagents. The coupling reaction was typically allowed to proceed for 2 h and

the beads were washed with DMF (3x) and DCM (3x). The Fmoc group was removed by

treatment twice with 20% piperidine in DMF (5 + 15 min) and the beads were

exhaustively washed with DMF (6x). To spatially segregate the beads into outer and

inner layers, the resin was treated with 20% piperidine in DMF (5 + 15 min), washed

with DMF and water, and soaked in water overnight. The resin was drained and

suspended in a solution of Nα-Fmoc-Glu(δ-N-hydroxysuccinimidyl)-O-CH2CH=CH2

(0.26 mmol, 0.50 equiv) and diisopropylethylamine (1.2 mmol or 2.0 equiv) in 30 mL of

55:45 (v/v) DCM/diethyl ether. The mixture was incubated on a carousel shaker for 30 min at room temperature. The beads were washed with 55:45 DCM/diethyl ether (3x) and

DMF (8x) to remove water from the beads and then treated with 2 equivalents of Fmoc-

Glu(tBu)-OH plus HBTU/HOBt/4-methylmorpholine in DMF (90 min). Next, the dipeptide Ala-Leu was added to the resin using standard Fmoc/HBTU chemistry. For the synthesis of random positions, the resin was split into 20 equal portions and each portion

(100 mg) was coupled twice with 5 equiv of a different Fmoc-amino acid/HBTU/HOBt/NMM for 2 h.
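The quantities in this procedure can be cross-checked with a few lines of arithmetic. The sketch below uses only values quoted in this chapter (2.0 g of resin, 0.26 mmol/g, ~100 pmol/bead, 20 portions, four random positions); the bead count and copies per sequence are estimates derived from those values, not measured figures.

```python
# Values quoted in the text for library I.
resin_g = 2.0                 # TentaGel S NH2 resin used
loading_mmol_per_g = 0.26     # resin loading
pmol_per_bead = 100.0         # approximate capacity of one 90 um bead
building_blocks = 20          # one amino acid per split portion
random_positions = 4          # cyclo(GXXXXALE)

total_mmol = resin_g * loading_mmol_per_g            # total amine on the resin
nhs_ester_equiv = 0.26 / total_mmol                  # 0.26 mmol of the NHS ester was used
portion_mg = resin_g * 1000 / building_blocks        # resin per split portion

beads = total_mmol * 1e9 / pmol_per_bead             # 1 mmol = 1e9 pmol
sequences = building_blocks ** random_positions      # theoretical diversity
copies_per_sequence = beads / sequences

print(f"{total_mmol:.2f} mmol amine, NHS ester = {nhs_ester_equiv:.2f} equiv")
print(f"{portion_mg:.0f} mg per portion, ~{beads:.1e} beads")
print(f"{sequences} sequences, ~{copies_per_sequence:.0f} beads per sequence")
```

On these assumptions the library is sampled with roughly thirty-fold redundancy, so a screen of a 20-30 mg aliquot still covers a meaningful fraction of the theoretical sequence space.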

To differentiate isobaric amino acids during MS sequencing, 5% (mol/mol) Ac-

Gly was added to the coupling reactions of Leu and Lys, whereas 5% Ac-Ala was added to norleucine reactions [5, 6]. After the four random positions were synthesized, a glycine was added to the N-terminus of all peptides to facilitate the cyclization reaction. Next, the allyl group on the C-terminal glutamate was removed with a solution containing tetrakis(triphenylphosphine)palladium (1 equiv), triphenylphosphine (3 equiv), formic

acid (10 equiv), and diethylamine (10 equiv) in anhydrous THF overnight at room

temperature. Anhydrous THF was obtained using Solvent Purification System from Solv-

Tek (Berryville, VA). The beads were sequentially washed with 0.5%

diisopropylethylamine in DMF, 0.5% sodium dimethyldithiocarbamate hydrate in DMF,

DMF (3x), DCM (3x), and DMF (3x). The N-terminal Fmoc group was then removed in

20% piperidine and the beads were washed with DMF (6x), 1 M HOBt in DMF (3x),

DMF (3x), and DCM (3x). For peptide cyclization, a solution of PyBOP/HOBt/NMM (5,

5, 10 equiv, respectively) in DMF was mixed with the resin and the mixture was

incubated on a carousel shaker for 3 h. The resin was washed with DMF (3x) and DCM

(3x) and dried under vacuum for >1 h. Side chain deprotection was carried out with

reagent K (6.5% phenol, 5% water, 5% thioanisole, 2.5% ethanedithiol, 1% anisole, and

1% triisopropylsilane in TFA) for 1 h. The resin was washed with TFA and DCM, and

dried under vacuum before storage at -20 °C.

The cyclic peptide library II, cyclo(AXXXXXVE), was synthesized on 5.0 g of TentaGel S NH2 resin by the same strategy as library I, with the following exceptions. For coupling onto Nα-methylated N-termini, the incoming amino acid was activated with HATU/NMM and used at high concentration (up to 12 equiv) instead of HBTU/HOBt/NMM. To differentiate isobaric amino acids during MS sequencing, 5% (mol/mol) of acetic acid-d4 (CD3CO2D) was added to the coupling reactions of Lys, Leu, and Orn, and 5% of propionic-2,2-d2 acid-d (CH3CD2CO2D) was added to the Nle reactions [7].

2.5.4 Determination of Cyclization Efficiency and Molar Ratio of Cyclic/Linear Peptides

To determine the yield of cyclization, ~10 mg of resin before or after cyclization (but

prior to side chain deprotection) was treated with excess benzylamine (BnNH2) and

PyBOP/HOBt/4-methylmorpholine (16 equiv) for 1.5 h. Afterwards, the resin was subjected to side chain deprotection with reagent K and 50 beads were randomly selected and placed into individual microcentrifuge tubes. The peptide on each bead was released from resin by cleavage with CNBr and analyzed by MALDI-TOF MS (see below for detailed procedure). Ninhydrin assay was performed by the addition of 300 μL of 76% phenol in ethanol (w/v), 300 μL of 20 μM KCN in pyridine, 300 μL of 5% ninhydrin in

ethanol (w/v) to ~1.0 mg of resin and incubating at 110 °C for 5 min. The absorbance at

580 nm was measured on a Perkin-Elmer Lambda 25 UV/visible spectrometer, and the absorbance of a mixture without resin was subtracted as background.

2.5.5 Peptide Sequencing by PED/MS.

Selected resin beads from the library I (usually anywhere from 1 to ~ 100 beads)

were pooled and subjected to partial Edman degradation in a single reaction vessel [5, 6].

The beads were suspended in 160 μL of pyridine/water (v/v 2:1) containing 0.1%

triethylamine and mixed with an equal volume of 0.2% (w/v) Nic-OSU and 5–9% (v/v)

PITC in pyridine. The reaction was allowed to proceed for 6 min with mixing and the

beads were washed with methanol, DCM, and TFA. The beads were treated twice with

~300 μL of TFA for 6 min each. After washing the resin with DCM, pyridine, and

pyridine/water (2:1), the cycle was repeated n times, where n equals the number of

residues to be sequenced. During the final cycle, the N-terminal amine was treated with

Nic-OSU in the absence of PITC.

For sequencing of beads from library II, Fmoc-OSU replaced Nic-OSU, as the newly developed PED method was used [7]. All other procedures were the same, except that Fmoc-OSU (2.5 μmol) and PITC (100 μmol) were used during the

degradation steps.
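The way a PED ladder encodes a sequence can be sketched as a toy model. The code below assumes an idealized spectrum in which every truncation product is present, all ladder members share one constant mass contribution (cap, linker, and C-terminal group, lumped into a hypothetical `constant_mass`), and no isobaric residues occur. It is an illustration of the readout principle, not the analysis software used in this work.

```python
# Monoisotopic residue masses (Da) for a small illustrative residue set.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "Q": 128.05858,
    "E": 129.04259, "H": 137.05891,
}

def ladder_masses(sequence, constant_mass=0.0):
    """Masses of the full-length peptide and of each N-terminal truncation."""
    return [
        constant_mass + sum(RESIDUE_MASS[aa] for aa in sequence[i:])
        for i in range(len(sequence) + 1)
    ]

def read_sequence(peaks, tolerance=0.02):
    """Read residues from differences between adjacent ladder peaks."""
    ordered = sorted(peaks, reverse=True)
    residues = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - delta) <= tolerance]
        # An ambiguous or missing match is reported as "?".
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)

# Example with the N-terminal portion of the GTHPQALE sequence from this chapter.
peaks = ladder_masses("GTHPQ", constant_mass=1000.0)  # constant_mass is arbitrary
decoded = read_sequence(peaks)
print(decoded)
```

Because only mass differences are used, the constant part of each ladder member cancels out, which is why the readout does not depend on the exact linker or capping group.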

2.5.6 MALDI-TOF Analysis of the Peptide

For MALDI-TOF analysis, the degraded beads were treated with ~1 mL of TFA

containing ammonium iodide (10 mg) and dimethylsulfide (20 μL) on ice for 30 min to reduce any oxidized Met. After washing with water, the beads were transferred into microcentrifuge tubes (1 bead/tube) and each treated with 20 μL of 70% TFA containing

CNBr (20 mg/mL) overnight in the dark. The solvents were evaporated under vacuum to

dryness and the peptides released from the bead were dissolved in 5 μL of 0.1% TFA in water. One μL of the peptide solution was mixed with 2 μL of saturated 4-hydroxy-α-cyanocinnamic acid in acetonitrile/0.1% TFA (1:1) and 1 μL of the mixture was spotted

onto a MALDI sample plate. When analyzing the individually synthesized peptides,

peptide in the crude or in the HPLC fraction was dissolved in 0.1% TFA before mixing

with the matrix. Mass spectrometry was performed at Campus Chemical Instrument

Center of The Ohio State University on a Bruker III MALDI-TOF instrument in an

automated manner. The data obtained were analyzed by either Moverz software

(Proteometrics LLC, Winnipeg, Canada) or Bruker Daltonics flexAnalysis 2.4 (Bruker

Daltonic GmbH, Germany).

2.5.7 Library Screening for Streptavidin Binding

In a micro-BioSpin column (0.8 mL, Bio-Rad), 30 mg of the cyclic peptide library

was swollen in DCM and extensively washed with DMF and then water. The resin was

incubated in a blocking buffer (30 mM Hepes, pH 7.4, 150 mM NaCl, 0.01% Tween 20,

and 0.1% gelatin) overnight with gentle mixing at RT. The resin was drained and

resuspended in 800 μL of a screening buffer (30 mM Hepes, pH 7.4, 150 mM NaCl, and

0.01% Tween 20) containing ~5 nM SA-AP (1:1000 dilution of commercially available solution) for 2.5 h incubation at 4 °C. The resin was drained and washed twice with 800

μL of SA-AP buffer (30 mM Tris-HCl, pH 7.4, 250 mM NaCl, 10 mM MgCl2, and 70

μM ZnCl2) and twice with 800 μL of SA-AP reaction buffer (30 mM Tris-HCl, pH 8.5,

100 mM NaCl, 5 mM MgCl2, 20 μM ZnCl2). The resin was transferred into a well on a

12-well plate (BD Falcon) by rinsing with 9 x 100 μL of the SA-AP reaction buffer.

Upon the addition of 100 μL of 5 mg/mL BCIP in the SA-AP reaction buffer, intense

turquoise color developed on positive beads in ~30 min, when the staining was quenched

by the addition of 3 mL of 8 M guanidine hydrochloride. The resin was washed

extensively with water and transferred into a 35-mm Petri dish, from which the positive

beads were picked manually with a pipette under a dissecting microscope. The screening

procedure was repeated once with 20 mg of the cyclic peptide library.

2.5.8 Labeling of α-Amylase with Biotin and Texas Red

For the biotinylation of α-amylase, 100 μL of protein (3.2 mg; ~60 nmol) as supplied by the vendor was first mixed with 400 μL of NaHCO3 buffer (50 mM, pH 8.5, containing 50 mM NaCl) and then with 2 molar equivalents of (+)-biotin N-hydroxysuccinimide ester dissolved in DMSO (10 mg/mL). The mixture was incubated on a rotor wheel incubator for 30 min at room temperature. After the labeling reaction, the excess biotin NHS ester was quenched by treatment with 50 μL of 1 M Tris buffer (pH 8) for 5 min in the cold room with gentle mixing on a rotisserie incubator. The reaction mixture containing the labeled protein was loaded onto a Sephadex G-25 column equilibrated with Tris-HCl buffer (pH 7.5, containing 100 mM KCl). The biotinylated protein was eluted from the column and then flash-frozen for storage until use.
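The reagent amounts in the biotinylation step can be worked out as follows. The sketch assumes the reagent is unmodified biotin N-hydroxysuccinimide ester (C14H19N3O5S, about 341.4 g/mol); the text does not state the molecular weight, so this value is an assumption.

```python
protein_mg = 3.2            # protein used, from the text
protein_nmol = 60.0         # approximate amount, from the text
equivalents = 2.0           # molar equivalents of biotin NHS ester
stock_mg_per_ml = 10.0      # reagent stock in DMSO
biotin_nhs_mw = 341.38      # g/mol, assumed for unmodified biotin-NHS

# mg/nmol equals 1e6 g/mol, i.e. 1000 kDa, hence the factor of 1000.
implied_mw_kda = protein_mg / protein_nmol * 1000

# nmol x g/mol gives ng; divide by 1000 for ug.
biotin_ug = equivalents * protein_nmol * biotin_nhs_mw / 1000
# mg/mL is numerically equal to ug/uL.
stock_volume_ul = biotin_ug / stock_mg_per_ml

print(f"implied protein mass: {implied_mw_kda:.1f} kDa")
print(f"biotin-NHS needed: {biotin_ug:.1f} ug = {stock_volume_ul:.1f} uL of stock")
```

Under these assumptions about 4 μL of the DMSO stock is added to the 500 μL reaction, so the DMSO content of the labeling mixture stays below 1%.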

The concentration of the protein was determined with Bradford protein assay and

UV absorbance at 280 nm. The extinction coefficient of α-amylase was calculated to be 139,060 M⁻¹ cm⁻¹ using ProtParam [140], and the results of the two assays differed by no more than two-fold. The SDS-PAGE mobility of the protein was similar before and after the

biotin labeling.

The labeling of protein with Texas Red was done similarly with some exceptions.

The reaction mixture was kept in dark during the labeling reaction. The degree of

labeling could be determined by comparing the absorbance at 280 nm and 595 nm. First, A595 was measured and the molar concentration of Texas Red was calculated using the molar extinction coefficient of 80,000 M⁻¹ cm⁻¹. Then the corrected A280 was obtained by subtracting 0.18 × A595 from the observed A280. The molar concentration of α-amylase was calculated from the corrected A280 using the molar extinction coefficient of the protein (139,060 M⁻¹ cm⁻¹). Typically, a labeling efficiency of ~70% or higher was obtained.
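The degree-of-labeling calculation described above can be written out as a short function. The extinction coefficients and the 0.18 correction factor are the values given in the text; the absorbance readings in the example and the 1 cm path length are hypothetical.

```python
def degree_of_labeling(a280, a595, eps_protein=139060.0, eps_dye=80000.0,
                       correction=0.18, path_cm=1.0):
    """Return (dye/protein molar ratio, protein molarity, dye molarity)."""
    dye_molar = a595 / (eps_dye * path_cm)
    # Remove the dye's contribution to the absorbance at 280 nm.
    a280_corrected = a280 - correction * a595
    protein_molar = a280_corrected / (eps_protein * path_cm)
    return dye_molar / protein_molar, protein_molar, dye_molar

# Hypothetical readings, chosen only to illustrate the arithmetic.
ratio, protein_molar, dye_molar = degree_of_labeling(a280=0.50, a595=0.20)
print(f"protein {protein_molar * 1e6:.2f} uM, dye {dye_molar * 1e6:.2f} uM, "
      f"labeling {100 * ratio:.0f}%")
```

With these example readings the dye-to-protein ratio comes out near 0.75, in the range of the ~70% labeling efficiency reported.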

2.5.9 Library Screening for α-Amylase Binding

For the screening of the cyclic peptide library with biotinylated α-amylase, an

aliquot of library I or II (~20-50 mg) was used each time. The overall procedure was similar to the screening procedure for streptavidin described in Section 2.5.7, with the following exceptions. The resin was incubated in screening buffer or HBST-gelatin buffer (30 mM Hepes, pH 7.4, 150 mM NaCl, 0.05% Tween 20, and 0.1% gelatin) containing either 50-100 nM biotinylated α-amylase alone or the preformed complex of biotinylated α-amylase and SA-AP (10:1 mixture). After incubation with SA-AP, the beads were washed several times either with SA-AP buffer alone or sequentially with SA-AP buffer, HBST-gelatin buffer, and SA-AP buffer.

For the screening of the cyclic peptide library with fluorescently labeled α-amylase, 400 mg of the cyclic peptide library II was swollen in DCM and extensively washed with DMF and then water in a Poly-prep column (10 mL, Bio-Rad). The resin was incubated in 8 mL of HBST-gelatin buffer (30 mM Hepes, pH 7.4, 150 mM NaCl, 0.05% Tween 20, and 0.1% gelatin) for 1 h with gentle mixing at rt and then poured into a square integrid dish (BD Falcon). A stock solution of α-amylase labeled with Texas Red was added, and the dish was covered with aluminum foil and incubated in the cold room with gentle mixing on a nutator. The concentration of the enzyme was increased gradually (2-4 h of incubation each time) from 10 nM until hits appeared under the fluorescence microscope. At an enzyme concentration of 1 μM, about two dozen beads appeared distinctly red compared to the other beads. Those beads were picked manually and washed with DMF, 8 M guanidine hydrochloride, 5 M NaCl, DMSO, 0.5 M HCl, and the HBST-gelatin buffer to remove the red color of Texas Red.

After the wash, the selected beads were subjected to the secondary screening against α-

amylase labeled with fluorescein.

2.5.10 Synthesis of Individual Peptides Binding to Streptavidin or α-Amylase

Each peptide was synthesized on 200 mg of Rink Resin LS (0.2 mmol/g) in a manner

similar to that employed for the library construction except that spatial segregation was

not necessary. After the addition of the last amino acid, the resin was split into two equal

aliquots. One aliquot was used for cyclization whereas the other was used to synthesize

the linear peptide as a control. For the preparation of cyclic peptides, the allyl group on

Glu was first removed and then the Fmoc group was removed, prior to cyclization. The

condition for cyclization was identical to that used during library construction and the

progress of cyclization was monitored by ninhydrin test. After cleavage and deprotection

as previously described, the crude peptides were purified by reversed-phase HPLC on a

C18 column and their identity was confirmed by MALDI-TOF mass spectrometric

analyses.

2.5.11 SA-AP Pull-Down Assay

Cyclic peptide (GTHPQALE) plus a miniPEG linker (Peptides International) and a

C-terminal methionine was synthesized on 90 μm TentaGel S NH2 resin as described

above. Fifteen mg of the resin was suspended in 1.5 mL of blocking buffer (30 mM

Hepes, pH 7.4, 150 mM NaCl, 0.01% Tween 20, and 0.1% gelatin) and ~20 μL aliquots

(~200 μg resin) were transferred to individual wells of a 96-well plate (Nalge Nunc,

Rochester, NY). After overnight incubation, each well was supplemented with 150 μL of

a screening buffer (30 mM Hepes, pH 7.4, 150 mM NaCl, and 0.01% Tween 20)

containing 2 nM SA-AP (1:2,500 dilution of the commercial sample) and 0-800 μM competitor peptide. The mixture was incubated for 30 min with gentle shaking. The resin in each well was transferred to a micro-BioSpin column, drained, and quickly washed twice each with 300 μL of SA-AP buffer and 300 μL of SA-AP reaction buffer. Next,

100 μL of SA-AP reaction buffer containing 10 mM p-nitrophenyl phosphate was added

to the resin and the mixture was incubated for 1.5 h with gentle shaking. The reaction was

stopped with the addition of 900 μL of 1 M NaOH, and the absorbance at 405 nm was

measured. In the absence of competitor peptide, the affinity resin containing cyclic

peptide (GTHPQALE) retained significantly higher amounts of SA-AP than

underivatized TentaGel resin (Figure 2.7). The presence of competitor peptides inhibited

the binding of SA-AP onto the affinity resin and the concentration of a competitor

peptide at which 50% of SA-AP binding was inhibited (IC50 value) was estimated from

the binding curves.
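One simple way to estimate an IC50 value from such a competition curve is log-linear interpolation between the two concentrations that bracket 50% binding. The sketch below shows that approach; it is not necessarily the procedure used for the reported values, and the data points are hypothetical.

```python
import math

def ic50_by_interpolation(conc_um, fraction_bound):
    """Interpolate on a log concentration scale where binding crosses 50%.

    Returns None if the curve never crosses 50% within the tested range.
    """
    points = sorted(zip(conc_um, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if c_lo > 0 and f_lo >= 0.5 >= f_hi:
            if f_lo == f_hi:
                return c_lo
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical competition data: SA-AP signal normalized to the no-competitor well.
concentrations = [0, 25, 50, 100, 200, 400, 800]       # uM competitor peptide
binding = [1.00, 0.90, 0.80, 0.60, 0.40, 0.20, 0.10]

ic50 = ic50_by_interpolation(concentrations, binding)
print(f"estimated IC50 ~ {ic50:.0f} uM")
```

A full fit to a binding isotherm would use all data points and give an uncertainty estimate; interpolation is shown here only because it needs no fitting library.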

2.5.12 α-Amylase Inhibition Assay with Starch as Substrate

The reaction conditions for the α-amylase inhibition assay were modified from the manufacturer's instruction manual (Sigma). After confirming that DMSO at 1-10% does not affect the assay, the crude peptide sample was dissolved in DMSO to prepare a stock solution of ~10 mM. Starch solution (1%) was prepared by boiling 1% potato starch (Sigma S2630) in water for 15 min, and the boiled solution was left at rt to cool. Sodium potassium tartrate tetrahydrate (12.0 g) was dissolved in 8.0 mL of 2 M NaOH by heating on a heating/stir plate. The Color Reagent Solution was prepared by adding the sodium potassium tartrate solution to 20 mL of 96 mM 3,5-dinitrosalicylic acid with stirring and diluting the mixture to 40 mL with water. The α-amylase enzyme was diluted directly from the enzyme stock (Sigma, A6255) into ice-cold water to give a final reaction concentration of ~1 μM. Each enzyme reaction was set to a volume of 200 μL by mixing 2-20 μL of peptide solution in DMSO, 100 μL of 2x reaction buffer containing 20 mM sodium phosphate with 6.7 mM sodium chloride (pH 6.9), 10 μL of diluted enzyme, water, and 20 μL of 1% starch. Every reaction component except starch was preincubated for 10 min, and starch was then added to initiate the enzyme reaction, which was run for 3 min. The reaction was stopped by adding 100 μL of the Color Reagent Solution. The color was developed by incubating the sample tube in boiling water for 15 min, and 900 μL of water was then added to the sample. The absorbance at 540 nm was measured on a UV spectrophotometer. A blank was prepared for each reaction in the same way except that the Color Reagent Solution was added prior to the addition of starch.
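The pipetting scheme and the data reduction for this assay can be sketched as below. The component volumes are those given in the text; the percent-inhibition formula (blank-corrected absorbance relative to a no-peptide control) is a common convention assumed here, and the absorbance values in the example are hypothetical.

```python
def water_volume_ul(peptide_ul, total_ul=200.0, buffer_ul=100.0,
                    enzyme_ul=10.0, starch_ul=20.0):
    """Water needed to bring one reaction to the total volume."""
    water = total_ul - buffer_ul - enzyme_ul - starch_ul - peptide_ul
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water

def dmso_percent(peptide_ul, total_ul=200.0):
    """DMSO content of the reaction, assuming the peptide stock is neat DMSO."""
    return 100.0 * peptide_ul / total_ul

def percent_inhibition(a_sample, a_sample_blank, a_control, a_control_blank):
    """Inhibition from blank-corrected A540 values (assumed convention)."""
    sample = a_sample - a_sample_blank
    control = a_control - a_control_blank
    return 100.0 * (1.0 - sample / control)

for peptide_ul in (2.0, 20.0):
    print(f"{peptide_ul:>4} uL peptide: {water_volume_ul(peptide_ul):.0f} uL water, "
          f"{dmso_percent(peptide_ul):.0f}% DMSO")

# Hypothetical absorbance readings.
inhibition = percent_inhibition(0.45, 0.05, 0.85, 0.05)
print(f"inhibition: {inhibition:.0f}%")
```

The 2-20 μL range of peptide stock corresponds to 1-10% DMSO in the 200 μL reaction, which matches the range over which DMSO was confirmed not to affect the assay.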


CHAPTER 3

SYNTHESIS AND SCREENING OF RESIN-BOUND COMBINATORIAL PEPTIDE LIBRARIES WITH FREE C-TERMINI: DETERMINATION OF THE SEQUENCE SPECIFICITY OF PDZ DOMAINS∗

3.1 Introduction

Combinatorial peptide libraries have been widely used to identify specific binding

ligands against receptors [4, 63-68], define the substrate specificity of enzymes [8, 69-74], and develop new catalysts [75-80]. However, proteins/enzymes that recognize the C-

terminus of another peptide/protein (especially those that require posttranslationally

modified C-termini) remain challenging targets for peptide library screening. As

reviewed in Chapter 1, there are many biological libraries generating free C-terminal

peptides (e.g., phage display [101, 102] and lacI repressor [104, 105] or GFP fusion

[109]). These methods are generally limited to 20 proteinogenic amino acids. On the

other hand, synthetic libraries allow the use of modified amino acids or unnatural

building blocks as the method is not limited to the proteinogenic amino acids.

Nevertheless, it has been technically difficult to prepare and screen a support-bound

* Reproduced with permission from Journal of the American Chemical Society, submitted for publication. Unpublished work copyright 2007 American Chemical Society.

combinatorial peptide library containing a free C-terminus. The well-established solid-phase peptide synthesis methodologies all start from the C-terminus, which must be covalently attached to the support (C→N synthesis). Attempts to synthesize peptides in the N to C direction are generally met with low yields and racemization problems [82].

Another obstacle is the lack of a reliable method to routinely sequence peptides from the

C-terminus after they are positively identified from a library. To avoid these technical difficulties, Songyang and co-workers synthesized C-terminal peptide libraries on the

solid phase, released the peptides from the resin, and performed screening assays in the solution phase [110]. The enriched peptide pool was sequenced by Edman degradation to

reveal the relative preference for certain amino acids at a given position. This method,

however, does not give individual binding sequences and therefore is unable to reveal any

sequence covariance (e.g., the existence of multiple consensus sequences). Furthermore,

the less abundant peptides with highest binding affinity would not be revealed with this

method. Other investigators have prepared peptide libraries with free C-termini by

“peptide inversion” [112, 116, 117]. In this approach, a peptide is first synthesized in the

conventional C→N manner and attached to a solid support via an ester linkage. Next, the

completed peptide is cyclized between its N-terminus and a C-terminal carboxyl group

installed on a bifunctional linker. Finally, cleavage of the ester linkage releases the free

C-terminus, but the peptide remains attached to the support at its N-terminus. Therefore,

they synthesized inverted peptide libraries on cellulose membranes in a spatially

addressable manner (SPOT synthesis) [116, 117]. As reviewed earlier, the limitations of SPOT synthesis are its low throughput (library sizes of ~10³−10⁴) and the requirement of a sophisticated robotic system for synthesis. Alternatively, Davies and Bradley [113, 114] have synthesized a 1000-member one-bead-one-compound (OBOC) library using the split synthesis method [1-3] by inverting only 80% of the resin-bound peptides. They left the remaining 20% of the peptides uninverted for sequencing by Edman degradation. A drawback of this method is that the encoding tags may interfere with library screening by binding to the screening target. Edman degradation is also expensive and time-consuming.

Many PDZ domains, as introduced in chapter 1, recognize C-terminal residues for interaction with other proteins to mediate their biological functions. Various approaches have been used to define the specificity of individual PDZ domains using peptide libraries

[101, 102, 104, 105, 109, 110, 116, 117] and theoretical methods [115, 141]. In this work, we have developed a general methodology for the synthesis and screening of OBOC peptide libraries containing free C-termini and applied it to determine the recognition motifs of four PDZ domains. This methodology is readily applicable to all PDZ domains and other proteins and enzymes that recognize the C-termini of their partner proteins/substrates.

3.2 Results

3.2.1 Design Strategy and Synthesis of Free C-Terminal Peptide Library

We adopted the strategy of Davies and Bradley [113, 114] to invert only a fraction of

the peptides on each resin bead (~33%), so that the non-inverted peptides (which have the

same sequence but with a free N-terminus) serve as an encoding sequence to facilitate

later sequence determination. To prevent the encoding peptides from interfering with

library screening, we topologically segregated each resin bead into two different layers; the bead surface would display the inverted peptide containing a free C-terminus,

whereas the inner core would carry the corresponding encoding peptide with a free N-

terminus (Figure 3.1). During library screening against a macromolecular target (e.g., a

PDZ domain), which is too large to diffuse into the bead interior, only the inverted

peptides on the bead surface would have access to the target. After a positive bead is

identified from the library, the identity of the binding peptide would be determined by

sequencing the encoding peptide on the bead by partial Edman degradation/mass

spectrometry (PED/MS) [6, 7], a high-throughput peptide sequencing technique

previously developed in our laboratory.

We tested this strategy on PDZ domains. Since PDZ domains generally make

contacts with the four to five C-terminal residues of their partner proteins, we designed a

peptide library containing five random residues at the C-terminus, resin-

MLLBBERAX5X4X3X2X1-COOH (library I), where B is β-alanine and X1−X5 represent

L-α-aminobutyrate (Abu, as a replacement of cysteine), L-norleucine (Nle, as a

replacement of methionine), or any of the 18 proteinogenic amino acids except for

cysteine and methionine (Figure 3.1). Library I had a theoretical diversity of 20⁵ (or

3,200,000) and was synthesized on TentaGel S resin (90 μm, ~2,860,000 beads/g, ~100

pmol peptide/bead). The terminal methionine permits peptide release by CNBr prior to

MS analysis (vide infra). The two β-alanines were intended to provide some flexibility to

the peptides to facilitate their binding to the PDZ domain. The two leucines were

included to ensure that the smallest peptides would have an m/z ratio of >600 in MALDI mass spectra to avoid signal overlap with the MALDI matrix. Synthesis of library I started with the addition of the BBLLM linker to the TentaGel resin using standard Fmoc chemistry. Next, the beads were spatially segregated into outer and inner layers using the

method of Lam [128], with the concomitant incorporation of Glu as a bifunctional linker.

Briefly, TentaGel beads bearing the BBLLM linker were soaked in water, drained, and quickly suspended in 55:45 (v/v) dichloromethane/diethyl ether containing 0.33 equivalent of a side chain N-hydroxysuccinimidyl (NHS) ester of L-glutamic acid, Nα-Fmoc-Glu(δ-NHS)-O-CH2CH=CH2. Because the organic solvent is immiscible with

water, only peptides on the bead surface were exposed to and reacted with the activated

ester. The beads were washed with DMF and the remaining free N-terminal amines in the

inner core (0.67 equivalent) were acylated with Boc-Gly-OH. After removal of the Fmoc

group from surface peptides, a p-hydroxymethylbenzoic acid (HMBA) linker was

coupled to the N-terminal amine. The Boc protecting group was then removed from the

internal peptides, and an Arg was added; the Arg provides a fixed positive charge to the

encoding sequences and greatly facilitates MS analysis. The random region was

synthesized by the split-and-pool method [1-3] to generate an OBOC library. A dipeptide

Arg-Ala was added to the N-terminus; the Arg provides a positive charge to the surface

peptides to facilitate MS analysis, whereas the Ala serves as a spacer to minimize any

potential bias the positively charged Arg may exert on library screening. The α-allyl group on the C-terminal glutamate and the N-terminal Fmoc group were removed by treatment with Pd(PPh3)4 and piperidine, respectively. Subsequent treatment with

benzotriazole-1-yl-oxy-tris-pyrrolidino-phosphonium hexafluoro-phosphate (PyBOP)

cyclized the surface peptides, while the encoding peptides in the bead interior were kept

in the linear form. Finally, hydrolysis of the ester linkage with NaOH exposed the free C-termini of the surface peptides, which remained attached to the support via their N-termini (Figure 3.1). Note that any uncyclized surface peptides would be cleaved off the resin by the NaOH treatment and therefore the bead surface should contain only the inverted peptides. Deprotection of side chains was carried out with trifluoroacetic acid (TFA) as usual.
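The library size and bead count quoted above can be turned into a rough estimate of how much of the sequence space a given mass of resin samples. The following is a minimal sketch, not part of the original work: it assumes that every bead carries one sequence drawn uniformly at random from the 20⁵ possibilities, and the function names are hypothetical.

```python
import math

# Figures quoted in the text for library I.
DIVERSITY = 20 ** 5          # five random positions, 20 building blocks each
BEADS_PER_GRAM = 2_860_000   # TentaGel S resin, 90 um

def beads_in(mass_mg):
    """Approximate number of beads in a given mass of resin (mg)."""
    return BEADS_PER_GRAM * mass_mg / 1000.0

def expected_coverage(n_beads, diversity=DIVERSITY):
    """Expected fraction of sequences present at least once.

    Assumption (not from the source): each bead carries a sequence drawn
    uniformly at random, so a given sequence is absent from all beads
    with probability (1 - 1/diversity) ** n_beads.
    """
    return 1.0 - math.exp(n_beads * math.log1p(-1.0 / diversity))

beads_20mg = beads_in(20)
coverage_100mg = expected_coverage(beads_in(100))
```

Under these assumptions, 20 mg of resin corresponds to about 57,000 beads, in line with the ~60,000 beads quoted for the 20 mg screen in Section 3.2.4, and 100 mg samples roughly 9% of the theoretical sequence space.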


Figure 3.1. Synthesis of Spatially Segregated and Inverted Peptide Library. Reagents: (a) standard Fmoc/HBTU chemistry; (b) soak in water and then 0.33 equiv of Nα-Fmoc-Glu(δ-NHS)-O-CH2CH=CH2 in Et2O/CH2Cl2; (c) excess Boc-Gly-OH, HBTU; (d) piperidine; (e) HMBA, HBTU; (f) TFA; (g) Fmoc-Arg(Pbf)-OH, HBTU; (h) Pd(PPh3)4; (i) piperidine; (j) PyBOP, HOBt; (k) NaOH, H2O; and (l) TFA.

3.2.2 Characterization of the Inverted Peptide Library

To assess the quality of the peptide library, resin beads from different stages of the

library synthesis were subjected to MS analysis. First, a small aliquot of resin (~ 2 mg) was withdrawn from the library immediately before the peptide cyclization step and treated with reagent K to remove side-chain protecting groups. Twenty-four beads were randomly selected and the peptide from each bead was released by CNBr cleavage and analyzed by MALDI-TOF MS. Out of the 24 beads, 23 showed a pair of peaks separated by 50 amu (Figure 3.2a), indicating that both surface (m/z M, the pseudo-molecular ion of the inverted peptide) and internal peptides (m/z M - 50) were successfully synthesized.

For some of the beads, the m/z M peak had much lower abundance relative to the m/z (M

– 50) peak. This was due to partial hydrolysis of the ester linkage of the surface peptides during overnight treatment with CNBr, which was dissolved in 70% TFA in water. The hydrolysis product, RAX5X4X3X2X1, gave an intense peak at m/z (M – 713) (Figure 3.2a).

Second, in a parallel experiment, 24 randomly selected beads were subjected to the same treatment as described above, except that the beads were treated with 1 N NaOH solution for 1 h prior to CNBr cleavage and MS analysis. The resulting MS spectrum of each bead showed a single peak in the expected m/z region for the encoding sequences (Figure

3.2b), consistent with complete hydrolysis (and therefore release) of the surface peptides.

Third, 24 beads were randomly selected from the final library (after NaOH hydrolysis

and deprotection with TFA), treated with CNBr, and analyzed by MS. The MS spectra of

23 beads showed a pair of peaks at m/z M and m/z (M – 50) (Figure 3.2c). Since any

uncyclized surface peptides would have been cleaved off the resin by the NaOH

treatment (Figure 3.2b), the m/z M peak in Figure 3.2c must be derived from the inverted

peptides on the bead surface (Figure 3.1). The molar ratio of the inverted/encoding

peptides on each bead was estimated from the relative abundance of the m/z (M – 50) and

m/z M peaks, by assuming that the two peptides have equal ionization efficiency in the

MS. The molar ratios of the 24 beads ranged from 0.00 to 1.49, but had an average value

of 0.52 (the theoretical value is 0.50) (Table 3.1). Although for an individual bead, the

estimated molar ratio may not be very accurate (due to potentially different ionization efficiency of the inverted vs. encoding peptides), collectively the above data indicate that

peptide cyclization and NaOH hydrolysis occurred at high yields and that substantial amounts of inverted peptides were present on each bead surface. As shown in chapter 2, the on-bead peptide N-to-C cyclization reaction is highly efficient (nearly quantitative) [138].
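The ratio estimate described above is a simple quotient of peak areas. As an illustration only, the sketch below recomputes it for four of the beads in Table 3.1; the function name is hypothetical, and the equal-ionization assumption is the one stated in the text.

```python
# Peak areas copied from Table 3.1: (encoding peak M - 50, inverted peak M).
PEAK_AREAS = {
    "C1": (7022, 4623),
    "C3": (5083, 4394),
    "C10": (5057, 0),     # inverted peak not detected (ND)
    "C18": (3681, 5483),
}

def inverted_to_encoding_ratio(encoding_area, inverted_area):
    """Molar ratio inverted/encoding, assuming equal ionization efficiency."""
    if encoding_area <= 0:
        raise ValueError("encoding peak area must be positive")
    return inverted_area / encoding_area

ratios = {
    bead: round(inverted_to_encoding_ratio(encoding, inverted), 2)
    for bead, (encoding, inverted) in PEAK_AREAS.items()
}

# Ratio expected from the 0.33 equiv (surface) / 0.67 equiv (interior) split.
THEORETICAL_RATIO = 0.33 / 0.67
```

The 0.33/0.67 split gives about 0.49, which matches the theoretical value of 0.50 quoted in the text to within rounding.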

[Figure 3.2: three MALDI-TOF mass spectra (panels a, b, and c); x axis, m/z (600 to 1600). Labeled peaks: (a) M – 713 at m/z 776.6, M – 50 at m/z 1440.0, and M at m/z 1489.9; (b) M – 50 at m/z 1402.0; (c) M – 50 at m/z 1438.1 and M at m/z 1488.0.]

Figure 3.2. MALDI-TOF Mass Spectra of Library I Peptides at Different Stages of Synthesis. (a) Prior to cyclization and no NaOH treatment; (b) NaOH treated uncyclized library; and (c) NaOH treated cyclized library.


Bead     M – 50 (encoding)      M (inverted)           Ratio
         m/z       Area         m/z       Area         (inverted/encoding)
C1       1540.2    7022         1590.1    4623         0.66
C2       1523.4    2393         1573.3    566          0.24
C3       1360.4    5083         1410.4    4394         0.86
C4       1438.5    2436         1488.4    2060         0.85
C5       1365.5    5993         1415.4    877          0.15
C6       1579.4    1676         1629.3    692          0.41
C7       1294.5    8831         1344.5    693          0.08
C8       1493.1    3729         1543.1    2            0.00
C9       1510.5    3489         1560.4    131          0.04
C10      1435.6    5057         ND        0            0.00
C11      1435.6    4328         1485.5    3102         0.72
C12      1471.6    3243         1521.5    4484         1.38
C13      1558.5    10668        1608.4    6521         0.61
C14      1449.6    2160         1499.5    2535         1.17
C15      1416.6    5669         1466.5    172          0.03
C16      1614.5    5329         1664.4    153          0.03
C17      1441.5    8414         1491.5    2373         0.28
C18      1492.6    3681         1542.5    5483         1.49
C19      1525.5    6416         1575.4    1358         0.21
C20      1584.4    9229         1634.3    4670         0.51
C21      1529.3    2721         1579.3    2492         0.92
C22      1446.3    2479         1496.2    2413         0.97
C23      1531.2    4064         1581.2    309          0.08
C24      1634.2    2130         1684.1    1876         0.88
average                                                0.52

Table 3.1. Internal/Surface Peptide Ratio for 24 Randomly Selected Beads. ND, not detected by MALDI-TOF (threshold = signal/noise > 3).

3.2.3 Peptide Sequence Determination by PED

Fifty beads were randomly selected from library I and subjected to seven cycles of

PED [6, 7], which converted the encoding peptide on each bead into a series of

progressively shorter peptides (Figure 3.3a). The inverted peptide on the surface was not

affected by the PED reaction. Fourteen of the beads were randomly chosen and placed

into individual microcentrifuge tubes, and the peptides were released by cleavage with

CNBr and analyzed by MALDI-TOF MS. Out of the 14 beads, 13 (93%) produced

spectra of sufficient quality to allow unambiguous sequence assignment, while one bead failed to show any signals, possibly due to poor sample preparation. Figure 3.3b shows a representative MS spectrum derived from one of the 14 beads. The inverted peptide produced an intense peak at m/z 1386.7, whereas the full-length encoding peptide generated a peak at m/z 1336.7. The peptide fragments derived from PED gave a series of peaks at m/z 1180.6, 1109.6, 1038.5, 925.5, 868.5, 740.4, and 683.3. From the masses of this peptide ladder, the sequence of the encoding peptide was determined as RAAMGKGRGBBLLM-resin, indicating that the inverted peptide had the sequence of resin-MLLBBE-(hydroxymethylbenzoyl)-RAAMGKG-COOH. Altogether, we sequenced 423 beads in this work and obtained 376 unambiguous sequences (89% success rate) (Table 3.2). The rest of the beads (11%) either had poor ionization in MS or missed one or more of the ladder peaks, preventing complete sequence assignment.
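Reading a sequence from a PED ladder amounts to matching the mass difference between adjacent peaks to a residue mass. The sketch below illustrates this on the peak list quoted above. It is a simplified illustration rather than the analysis procedure used in this work: the function name and the 0.15 Da tolerance are assumptions, and isobaric or near-isobaric residues (Leu/Ile/Nle, and Lys/Gln at this mass accuracy) cannot be resolved by mass difference alone, which is why the capping labels described in the Figure 3.3 legend are needed.

```python
# Monoisotopic residue masses (Da) of the library building blocks.
# Leu, Ile, and Nle are isobaric and share one entry.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "Abu": 85.05276, "S": 87.03203,
    "P": 97.05276, "V": 99.06841, "T": 101.04768, "L/I/Nle": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333, "W": 186.07931,
}

def decode_ladder(peaks, tolerance=0.15):
    """Return residues from N-terminus inward, one per ladder step.

    Peaks are sorted from heaviest to lightest; each mass difference is
    assigned to the closest residue mass, and a ValueError is raised if
    no residue lies within the tolerance (Da).
    """
    ordered = sorted(peaks, reverse=True)
    sequence = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        name, mass = min(RESIDUE_MASS.items(), key=lambda item: abs(item[1] - delta))
        if abs(mass - delta) > tolerance:
            raise ValueError(f"no residue matches a mass difference of {delta:.2f}")
        sequence.append(name)
    return sequence

# Full-length encoding peptide followed by the seven PED fragments.
LADDER = [1336.7, 1180.6, 1109.6, 1038.5, 925.5, 868.5, 740.4, 683.3]
residues = decode_ladder(LADDER)
```

The seven steps decode as R, A, A, a 113 Da residue, G, K, G, which corresponds to RAAMGKG once the 113 Da step is assigned to Nle (written as M in this chapter).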

[Figure 3.3: (a) reaction scheme for partial Edman degradation of the resin-bound encoding peptide: 1) 40:1 PITC/Fmoc-OSU; 2) TFA; repeat 6x; then 1) piperidine; 2) CNBr; MS. (b) MALDI-TOF mass spectrum; x axis, m/z (600 to 1400). Interior peptide = RAAMGKGRGBBLLM-resin; surface peptide = resin-MLLBBE(Hmb)RAAMGKG-COOH. Labeled peaks: inverted peptide at m/z 1386.7; ladder peaks at m/z 1336.7, 1180.6, 1109.6, 1038.5, 925.5, 868.5, 740.4, and 683.3 (reading R, A, A, M, G, K, G); starred (*) peaks at m/z 983.5 and 785.4.]

Figure 3.3. Partial Edman Degradation of Free C-Termini Peptide Library. (a) Partial Edman degradation of resin-bound peptide. PITC, phenylisothiocyanate; Fmoc-OSU, N-(9-fluorenylmethoxycarbonyloxy) succinimide; M*, homoserine lactone. (b) MALDI-TOF mass spectrum of the peptide and its degradation products from a randomly selected bead with the sequence RAAMGKG-CO2H. *, Peptide fragments N-terminally acylated with CD3CO- (Leu and Lys) or CH3CD2CO- group (Nle); they were derived from library synthesis to differentiate isobaric amino acids [7]. Hmb, p-hydroxymethylbenzoyl.


Trial no.         Screening target    No. of beads analyzed by PED/MS    No. of complete sequences obtained
1                 None                14                                 13 (93%)
2                 NHERF-PDZ1          84                                 67 (80%)
3                 CIPP-PDZ2           93                                 82 (88%)
4                 CIPP-PDZ3           85                                 80 (94%)
5                 CIPP-PDZ4           147                                134 (91%)
Total (average)                       423                                376 (89%)

Table 3.2. Success Rate for Sequencing Resin-Bound Inverted Peptides by PED/MS.

3.2.4 Identification of Binding Ligands for PDZ Domains

We chose to test the validity of our library method on the PDZ1 domain of sodium-

hydrogen exchanger regulatory factor-1 (NHERF1). NHERF1 and its homologue

NHERF2 represent a family of adaptor proteins characterized by the presence of two N-

terminal PDZ domains and a C-terminal domain that binds to cytoskeleton proteins ezrin,

radixin, and moesin [142]. The PDZ domains of NHERF1 and NHERF2 bind to an array

of transmembrane and soluble proteins, including ion channels, transcription factors, and

cell surface receptors. A previous library study has shown that the NHERF1 PDZ1

domain binds to peptides of the consensus eX(T/s)(R/y)(L/f)-COOH, where X is any

amino acid and the lower-case letters represent less frequently selected amino acids [105].

To avoid any potential bias during library screening exerted by the positively charged

Arg at position –6 (relative to the C-terminal residue, which is defined as position 0) in

library I, we synthesized another library (library II), resin-MLLBBEAAX5X4X3X2X1-

COOH, in which the arginine was replaced by an alanine. Screening of a total of 20 mg

of library II (~60,000 beads) produced 84 positive beads, which were sequenced by

PED/MS to give 67 complete sequences (Table 3.3). Among these sequences, 57 clearly

belong to one class (class I) with a consensus motif of XX(T/s/c/h)(R/y)(F/L/M)-COOH

(Figure 3.4). Thus, our results are in excellent agreement with the earlier study, in which a genetically encoded peptide library was screened against the PDZ1 domain [105]. The

10 class II peptides each contained at least two histidine residues (Table 3.3). Since similar sequences were also obtained from the screening experiments against other PDZ domains and later binding assays showed that they had no detectable binding affinity to

the PDZ domains (vide infra), we conclude that the class II peptides were derived from

nonspecific binding of unknown origin.

[Figure 3.4: histogram of amino acid occurrence at positions -4 to 0; x axis, amino acid (D E N Q H K R W F Y M L I V C T S A G P); y axis, No. of Occurrence (0 to 30).]

Figure 3.4. Sequence Specificity of NHERF1 PDZ1 Domain. Displayed are the amino acids identified at each position (-4 to 0). Number of occurrence on the y axis represents the number of selected sequences that contained a particular amino acid at a certain position.
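The per-position tallies plotted in Figure 3.4 can be computed directly from the selected sequences. The following sketch uses a small subset of the class I sequences listed in Table 3.3, purely for illustration; the function name is hypothetical, and the one-letter codes follow the table (C, (S)-2-aminobutyric acid; M, norleucine).

```python
from collections import Counter

# Illustrative subset of the class I sequences in Table 3.3 (not the full set).
SEQUENCES = ["YDCRF", "LICRF", "DWCRF", "AHCRF", "CRTYL", "FNCRL", "AGCRL", "GNTYF"]

def position_counts(sequences):
    """Count residues at each position, numbered so the C-terminus is 0."""
    length = len(sequences[0])
    counts = {position: Counter() for position in range(-(length - 1), 1)}
    for sequence in sequences:
        if len(sequence) != length:
            raise ValueError("all sequences must have the same length")
        for index, residue in enumerate(sequence):
            counts[index - (length - 1)][residue] += 1
    return counts

counts = position_counts(SEQUENCES)
most_common_at_minus_1 = counts[-1].most_common(1)[0]
```

For this subset, Arg dominates position -1 and Phe or Leu occupies position 0, consistent with the class I consensus given in the text.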

Class I                                  Class II
YDCRF  YWGRF  CRTYL  LITHL  IHHRM        HHQHP
LICRF  WFHRF  FNCRL  DFTIL  HWTAM        RKHFH
DWCRF  YWHRF  AGCRL  YTCML  WITKM        VHGHR
AHCRF  LHTCF  RRFRL  QTARM  YYCKM        VHKYH
NYERF  YCHRF  FPHRL  KDCRM  HTSRI        TRHIH
MSHYF  YVHRF  LHNRL  LDIRM  QYTRI        LHNNH
GNTYF  HIKRF  QDSRL  QWSRM  QCTRC        HKHKL
CSTYF  WSNRF  TTSRL  VFHRM  TRFRP        YKHRH
YHTSF  TKNRF  IGTRL  VAHRM  NQTRV        MHYIH
MRATF  NWSRF  QWSAL  VYTRM               KHFHT
RISKF  RHSHF  FHTAL  RHSVM
LKTLF  AFTYL  RYTAL  YSSYM

Table 3.3. Selected Sequences for PDZ1 Domain from NHERF1 (67 total). C, (S)-2-aminobutyric acid; M, norleucine.

We next applied the library method to determine the sequence specificity of CIPP

(for channel-interacting PDZ domain protein) PDZ domains [107]. CIPP is a 612-amino

acid scaffolding protein containing four PDZ domains. It has been shown to bind to the

C-termini of several ion channel proteins and cell surface receptors [107, 143], but

otherwise little is known about the specificity of its PDZ domains. We expressed the four

PDZ domains individually in E. coli; however, only PDZ2, PDZ3, and PDZ4 domains

gave soluble proteins. Thus, library II (100 mg) was screened against each of the three

PDZ domains (at 0.5−1.0 μM), resulting in 82, 80, and 134 binding sequences

respectively (Tables 3.4–3.6). Inspection of the selected sequences reveals that PDZ2

and PDZ3 domains have similar specificities, with consensus sequences of

XX(S/h)(Y/R)V-COOH and XX(y/s/h/v/i/l/t)(Y/f/m/l)V-COOH, respectively (Figure 3.5).

A notable difference between these two domains is the preference for an Arg at the –1 position by PDZ2 but not by PDZ3 domain. PDZ4 domain has a very different specificity

profile. Unlike the stringent requirement for a Val at position 0 by PDZ2 and PDZ3

domains, PDZ4 domain accepts Ile, Leu, Val, and Nle at this position. On the other hand,

PDZ4 domain has a more stringent requirement for a Thr (and to a lesser extent Ser) at the –2 position as compared to PDZ2 and PDZ3 domains (Figure 3.5). At positions –3 and –4,

PDZ4 domain has a general preference for hydrophobic residues, whereas PDZ2 and

PDZ3 domains prefer hydrophilic and small amino acids. Thus, PDZ4 domain has a

consensus of φφ(T/s)X(I/L/V/M), where φ represents hydrophobic amino acids.

Consistent with our findings, acid-sensitive ion channel 3 (ASIC3), which has a C-terminal sequence of LVTRL-COOH, binds selectively to the PDZ4 domain of CIPP [143], whereas potassium ion channels Kir4.1 (which has a C-terminal sequence RISNV-COOH) and Kir4.2 (QQSNV-COOH) and glutamate receptors NR2A (IESDV-COOH), NR2B (IESDV-COOH), NR2C (LESEV-COOH), and NR2D (LESEV-COOH) bind to the PDZ2 and/or PDZ3 domains [107].

Class I           Class II
RYSRC   HITRV     RHAHC   RHFHK
HCSYC   MKSRV     HRHFF   RHRHL
IHSYC   HHSRV     IHRHF   HSHRL
RISYC   HLSRV     KHCHF   YHFHR
QISRI   YLSRV*    RHRHF   YHHTR
KFSYI   TMSRV     MHHRF   HHFHT
KHSYI   VMSRV     YHHRF   HRHGV
FRHAV   KMTRV     YKRHH   IHRHV
YKSCV   FQHRV     KRHMH   HKHRV
YRHCV   WRSRV     HGHRH   HQHYV
KISFV   MSSRV     RGHRH   HRHYV
RRFFV   KFAYV     RLHRH*  RHKHY
WRHHV   HGSYV     ILHTH
WAHHV   MHGYV     HRHKI
WHHAV   IHNYV     HRHVI
IHHCV   PHNYV     HKHYI
KRSIV   KHTYV
RYSIV   KHYYV
YFSKV   RHYYV
HGHMV   FKNYV
VHSMV   MKTYV
RYHMV   LQHYV
KYSMV   MRHYV
AAHRV   IRWYV
YASRV   TWHYV
VFHRV*  KYHYV
FISRV   TRHFY

Table 3.4. Peptide Sequences Selected against the PDZ2 Domain of CIPP (82 total). * Sequences selected for resynthesis and SPR analysis.


Class I                     Class II
YSHYC   KGVIV   YHKYV       RHCHC
YPIFI   RGILV   HKMYV       FRHCH
KQYHI   RASLV   CHRYV       LTHFH
HAVYI   CHYLV   HGSYV       IHRFH
KGICV   RCYLV   LKSYV       RSHKH
KHCFV   HRHMV   MSSYV       KRHQH
MKHFV   RGIMV   RYSYV       FVHRH
RPHFV   HAVMV   RKSYV       IIHRH
PPLFV   HGYMV   YRSYV       MMHRH
RALFV   RPSNV   ICTYV       HKHVH
KAQFV   KAIQV   NCTYV       HKHYH
RHSFV   RNTWV   HGVYV       RPHYH
YPTFV   CKYWV   KKVYV*      PHKHI
RPFHV   KHCYV   GAYYV       HKHYM
WHSHV   SRFYV   KHYYV*      AHAHR
CIYHV   CPHYV   KAYYV       HRHIR*
RHLIV   TCHYV   NHYYV       LHHMR
RALIV   TKHYV   RVYYV       RHFHV
                            RHFHV
                            KHLHV
                            HMHRV
                            HQHVV
                            RHFHY
                            IHRHY
                            HKHIY
                            HRHVY

Table 3.5. Peptide Sequences Selected against the PDZ3 Domain of CIPP (80 total). * Sequences selected for resynthesis and SPR analysis.


Class I
YWTAC   YTTRI   FYSRL   WVSHV
YYTNC   IYTRI   IATRL   MVHHV
WFTSC   VISTI   YFTRL*  CYTHV
WFTSC   WITWI   YVTRL   HCTHV
FVHHI   VMTWI   FFTSL   WLTKV
TYHHI   IMSYI   HFTSL   YYTKV
YIHHI   VYSYI   VMTSL   MMSQV
LISAI   MCTYI   WYTTL   IFTQV
WKTAI   AHTYI   ISTYL   MHTQV
CMTAI   LLTYI   MTTAM   FFSRV
HSTAI   MRTYI   MTTAM   LISRV
VCTCI   AYTYI   HCTFM   MVTRV
IRSFI   VFTAL   IHTFM   WATSV
VGTFI   YLTAL   AFSHM   AITSV
AHTFI   GMTAL   MISHM   YHTWV
DITFI   MMTAL   LSTHM   DTTWV
GMTFI   TVTDL   YVTHM   YYTWV
CISHI   TWTDL   HTTHM   RFHYV
IISHI   CHTFL   FLSKM   HRHYV
VVSHI   NHTFL   IITKM   FCSYV
ISTHI   ICSHL   WSHMM   FISYV
LTTHI   KLSHL   FITNM   ILSYV
VVTHI*  IMSHL   MFSRM   HYSYV
WHTHI   WHSHL   WMSRM
WFSII   ICTHL   IVTRM   Class II
MLSKI   LITHL   WYTRM   HMHSM
IMSKI   PMTHL   VITYM   HIHTM
PITKI   QMTHL   LYTYM   MHHHY
VYTMI   WHTIL   RSRRR
WCTNI   RITKL   LLSAV
LHTNI   YLTKL   HSSFV
LVTNI   MSTML   LTTFV
YMTPI   WHTPL   LASHV
HCTRI   LCTQL   VCSHV
LHTRI   LIHRL   MFSHV
MRTRI   WMHRL   WLSHV

Table 3.6. Peptide Sequences Selected against the PDZ4 Domain of CIPP (134 total). * Sequences selected for resynthesis and SPR analysis.


[Figure 3.5: five histogram panels, one per position (-4, -3, -2, -1, and 0), each comparing CIPP-PDZ2, CIPP-PDZ3, and CIPP-PDZ4; x axis, amino acid (D E N Q H K R W F Y M L I V C T S A G P); y axis, Occurrence (%), 0 to 100.]

Figure 3.5. Sequence Specificity of CIPP PDZ Domains. Displayed are the amino acids identified at each position (-4 to 0). Occurrence on the y axis represents the percentage of selected sequences that contained a particular amino acid at a certain position.

3.2.5 Binding Affinity between Selected Peptides and CIPP PDZ Domains

To confirm the on-bead screening results, two representative peptides selected

against each of the CIPP PDZ domains were synthesized and their dissociation constants

against the three PDZ domains were determined by surface plasmon resonance (SPR).

The PDZ domains were biotinylated and immobilized onto a streptavidin-coated

sensorchip. Binding analysis was carried out by flowing different concentrations of the

synthetic peptides (0−1500 μM) over the sensorchip. Two peptides selected against the

PDZ4 domain, YFTRL and VVTHI, bound to the PDZ4 domain with high affinities (KD

= 1.7 and 3.1 μM, respectively) but had no detectable binding toward the PDZ2 or PDZ3

domain (Table 3.7). Likewise, the two peptides selected against the PDZ3 domain

(KHYYV and KKVYV) bound selectively to the PDZ3 domain, although the binding

affinities (KD = 19 and 32 μM, respectively) were an order of magnitude lower than that

of PDZ4 and its cognate ligand (see Figure 3.6 for actual sensorgrams). Interestingly, the

peptides selected against the PDZ2 domain (VFHRV and YLSRV) bound weakly to all

three PDZ domains (KD = 70−2400 μM). Finally, we tested two of the class II peptides

(RLHRH and HRHIR) and found that they had no significant binding affinity for any of the PDZ domains. It has previously been reported that PDZ domains generally have lower affinity to their cognate ligands, as compared to other modular domains (e.g., Src homology 2 domains) and our results are well within the range of KD values reported earlier [101, 115, 116, 122].

[Figure 3.6: (a) overlaid sensorgrams for four flow channels (channel 1, streptavidin; channel 2, CIPP-PDZ2; channel 3, CIPP-PDZ3; channel 4, CIPP-PDZ4); x axis, time (s), 0 to 250; y axis, Resonance Units, 0 to 400. (b) Plot of RUeq (0 to 300) against [Ac-YAA-KHYYV-OH] (μM, 0 to 70), annotated KD = 19 μM.]

Figure 3.6. SPR Analysis of the Binding Interaction between Immobilized PDZ Domains and Peptide Ac-YAAKHYYV-OH (which was selected against the PDZ3 domain of CIPP). (a) Overlaid sensorgrams of four different flow channels injected with increasing concentrations of the peptide (1, 5, 10, 20, 40, 60 µM) for 2 min at a flow rate of 5 µL/min. (b) Secondary plot of RUeq of the PDZ3 domain as a function of the peptide concentration. The data were fitted to the equation RUeq = RUmax[peptide]/(KD + [peptide]), where RUmax is the maximum response unit.
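The equation in the legend above can be fitted with a short least-squares routine. The sketch below is one possible way to do so and is not the fitting procedure used in this work. The response values are synthetic, generated from assumed parameters (RUmax = 300, KD = 19 μM) at the six concentrations listed in the legend; all function names are hypothetical.

```python
import math

def binding_isotherm(conc, ru_max, kd):
    """RUeq = RUmax * [peptide] / (KD + [peptide])."""
    return ru_max * conc / (kd + conc)

def best_ru_max(concs, responses, kd):
    """Closed-form least-squares RUmax at fixed KD (the model is linear in RUmax)."""
    occupancy = [c / (kd + c) for c in concs]
    numerator = sum(y * f for y, f in zip(responses, occupancy))
    denominator = sum(f * f for f in occupancy)
    return numerator / denominator

def squared_error(concs, responses, kd):
    """Sum of squared residuals at fixed KD, using the best RUmax for that KD."""
    ru_max = best_ru_max(concs, responses, kd)
    return sum((y - binding_isotherm(c, ru_max, kd)) ** 2
               for c, y in zip(concs, responses))

def fit_kd(concs, responses, kd_low=0.01, kd_high=10000.0, iterations=100):
    """Golden-section search over log10(KD); assumes one minimum in the bracket."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    low, high = math.log10(kd_low), math.log10(kd_high)
    for _ in range(iterations):
        left = high - inv_phi * (high - low)
        right = low + inv_phi * (high - low)
        if squared_error(concs, responses, 10 ** left) < squared_error(concs, responses, 10 ** right):
            high = right
        else:
            low = left
    kd = 10 ** ((low + high) / 2.0)
    return kd, best_ru_max(concs, responses, kd)

# Concentrations from the figure legend; responses are SYNTHETIC, not measured data.
CONCENTRATIONS_UM = [1, 5, 10, 20, 40, 60]
SYNTHETIC_RU = [binding_isotherm(c, 300.0, 19.0) for c in CONCENTRATIONS_UM]
fitted_kd, fitted_ru_max = fit_kd(CONCENTRATIONS_UM, SYNTHETIC_RU)
```

Because RUmax enters the model linearly, it is solved in closed form for each trial KD, which leaves a one-dimensional search. With real sensorgram data the residuals would not vanish, and an error estimate on KD would also be needed.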


                 KD (μM)
Peptideᵃ         PDZ2             PDZ3           PDZ4
YFTRL-OH         NB               NB             1.7 ± 0.1ᵇ
VVTHI-OH         NB               NB             3.1 ± 1.7ᵇ
KHYYV-OH         NB               19 ± 1ᵇ        NB
KKVYV-OH         NB               32 ± 2ᵇ        NB
VFHRV-OH         2400 ± 1200ᵇ     580 ± 60       1100 ± 400
YLSRV-OH         630 ± 540ᵇ       1200 ± 460     70 ± 40
RLHRH-OH         NB               NB             NB
HRHIR-OH         NB               NB             NB

Table 3.7. Dissociation Constants (KD, μM) of Selected Peptides toward CIPP PDZ Domains. ᵃ All peptides contained an Ac-YAA or Ac-YSS at their N-termini. ᵇ The PDZ domain to which the peptide was selected; NB, no significant binding was detected.

3.2.6 Database Search of Potential CIPP-Binding Proteins

We searched the Protein Information Resource database (website:

http://pir.georgetown.edu/) for potential CIPP-binding proteins. We restricted our search

to mouse proteins for simplicity. For the PDZ2 domain, motif [HTS][RFY][V]> (where

“>” restricts the search to protein C-terminus) was used for the pattern search. Similarly,

motifs [HYLIVTS] [HFYLI] [V]> and [GAVLIMPFYW] [GAVLIMPFYW] [ST] X

[IVML]> were used for the PDZ3 and PDZ4 domains, respectively. These searches

recovered 15, 41, and 104 potential targets for the PDZ2 (Table 3.8), PDZ3 (Table 3.9), and PDZ4 (Table 3.10), respectively, after eliminating any redundant, fragment, or hypothetical proteins. Among the predicted targets for PDZ2 and PDZ3 domains, neuroligin 3 and Nrxn3 proteins have previously been shown to interact with the PDZ domains [107]. For the PDZ4 domain, as expected, acid sensitive ion channel 3, a known

PDZ4 domain-binding protein [143], was recovered from the database search. It is highly

probable that some of the other predicted binding proteins, especially the cell surface

receptors, adhesion molecules, and channel proteins, will prove to be bona fide CIPP-

interacting proteins. A few previously reported CIPP-binding proteins (e.g., glutamate

receptors [107]) were not recovered by the database search, because their C-terminal

sequences contain an acidic amino acid at the –1 position (Asp or Glu), which is not

preferred by the PDZ2 or PDZ3 domain. On the basis of our results, we predict that the

C-termini of these proteins (IESDV and LESEV) would bind only very weakly to CIPP

PDZ2 and/or PDZ3 domains. It is possible that CIPP is engaged in additional interactions

with more upstream regions of the receptor proteins. Alternatively, the interaction

between CIPP and its partner proteins may involve oligomerization of the binding partners.
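The three search motifs map directly onto regular expressions anchored at the C-terminus. The sketch below illustrates that mapping and is not the database tool's own implementation. It assumes, as stated in the text, that ">" marks the protein C-terminus and that X stands for any residue; the helper names are hypothetical.

```python
import re

# Search motifs as written in the text.
MOTIFS = {
    "PDZ2": "[HTS][RFY][V]>",
    "PDZ3": "[HYLIVTS] [HFYLI] [V]>",
    "PDZ4": "[GAVLIMPFYW] [GAVLIMPFYW] [ST] X [IVML]>",
}

def motif_to_regex(motif):
    """Translate a motif into a Python regex anchored at the C-terminus."""
    # None of the bracketed residue sets above contains X, so a plain
    # replacement of X by "." (any residue) is safe for these motifs.
    pattern = motif.replace(" ", "").replace("X", ".")
    if pattern.endswith(">"):
        pattern = pattern[:-1] + "$"
    return re.compile(pattern)

def domains_matching(c_terminus):
    """Return the CIPP PDZ domains whose motif matches a C-terminal sequence."""
    return [name for name, motif in MOTIFS.items()
            if motif_to_regex(motif).search(c_terminus)]

# C-terminal sequences taken from the text and Tables 3.8 and 3.9.
hits = {seq: domains_matching(seq) for seq in ["STTRV", "KEYYV", "LVTRL", "IESDV"]}
```

Applied to sequences from this chapter, neuroligin 3 (STTRV) matches only the PDZ2 motif, Nrxn3 (KEYYV) only the PDZ3 motif, and ASIC3 (LVTRL) only the PDZ4 motif, while the glutamate receptor C-terminus IESDV matches none, as discussed above.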

C-terminal Sequence    ID no.    Protein
CCSFV                  Q9JKM7    Ras-related protein Rab-37
EESFV                  Q05CF3    Insc protein
EWTRV                  Q61152    Tyrosine-protein phosphatase non-receptor type 18 (EC 3.1.3.48) (Fetal liver phosphatase 1) (FLP-1) (PTP-K1)
FLSFV                  Q8K4E5    2'-5' oligoadenylate synthetase 2
FSTYV                  A2CG52    Kalirin, RhoGEF kinase
KPSRV                  Q8CIV5    Testis-specific leucine zipper protein nurit
LITRV                  A2RSL2    Microrchidia 4
PGTYV                  Q3TEL6    RING finger protein 157
PNTFV                  Q61121    Probable G-protein coupled receptor 19
RITFV                  Q3UDI8    minichromosome maintenance deficient 7
SHHYV                  Q8BKW4    Zinc finger CCHC domain-containing protein 4
STSYV                  Q8BXA6    Claudin-17
STTRV                  A2AGI2    Neuroligin 3*
VDTRV                  Q925N4    Claudin-16 (Paracellin-1)
YNTFV                  Q8K2K3    E130014J05Rik protein

Table 3.8. Potential CIPP-PDZ2 Binding Proteins from Database Search. * Proteins that have been shown to interact with corresponding PDZ domains [143]

C-terminal Sequence    ID no.    Protein
AAVFV                  Q4QQL3    Cathepsin O
AKVFV                  P58404    Striatin-4 (Zinedin)
EESFV                  Q05CF3    Insc protein
EFIFV                  Q6EDY6    CARMIL
EGIYV                  Q8CJ96    Ras association domain-containing protein 8 (Carcinoma-associated protein HOJ-1 homolog)
ERLLV                  Q9D0T1    NHP2-like protein 1 (High mobility group-like nuclear protein 2 homolog 1) (U4/U6.U5 tri-snRNP 15.5 kDa protein) (Sperm-specific antigen 1) (Fertilization antigen 1) (FA-1)
FHTLV                  Q9D0P8    Putative GTP-binding protein RAY-like (Rab-like protein 4)
FLSFV                  Q8K4E5    2'-5' oligoadenylate synthetase 2
FSTYV                  A2CG53    Kalirin, RhoGEF kinase
HWLLV                  Q61327    Sodium-dependent dopamine transporter (DA transporter) (DAT)
IASIV                  Q7TRG2    Olfactory receptor Olfr845 (Olfactory receptor 845)
KESYV                  P41233    ATP-binding cassette sub-family A member 1 (ATP-binding cassette transporter 1) (ATP-binding cassette 1) (ABC-1)
KEYYV                  Q6P9K9    Nrxn3 protein*
KGIHV                  Q80TY4    Suppression of tumorigenicity protein 18
KTSHV                  Q8CI59    Metalloreductase STEAP3 (EC 1.16.1.-) (Six-transmembrane epithelial antigen of prostate 3) (Tumor suppressor-activated pathway protein 6) (Dudulin-2) (Protein nm1054)
KVSHV                  Q8C739    Protein FAM110B
LQILV                  A2ANS3    Heat shock protein 12B
LWLFV                  Q0VE84    UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 6
MGLLV                  Q61235    Beta-2-syntrophin (59 kDa dystrophin-associated protein A1 basic component 2) (Syntrophin-3) (SNT3) (Syntrophin-like) (SNTL)
NGLLV                  P19253    60S ribosomal protein L13a (Transplantation antigen P198) (Tum-P198 antigen)
NVTLV                  Q7TSN7    Cell adhesion molecule JCAM
PASHV                  Q06054    Zinc finger protein
PETLV                  O54834    Rho GTPase-activating protein 6 (Rho-type GTPase-activating protein RhoGAPX-1)
PGSLV                  Q148W7    Lnx2 protein
PGTYV                  Q3TEL6    RING finger protein 157
PNTFV                  Q61121    Probable G-protein coupled receptor 19
QGTLV                  P01741    Ig heavy chain V region (Anti-arsonate antibody)
QKILV                  A2ASU5    Olfactory receptor 1022
QTSIV                  Q80WG6    UROP11-110
QYSFV                  Q91ZM3    Pro-rich, PH and SH2 domain-containing signaling mediator alpha
RAVYV                  Q91VT9    Pvrl2 protein
RFHIV                  A3KFV4    Procollagen type XVI alpha 1
RGHLV                  Q925S5    Nephrin
RQLYV                  Q80UI7    C-type lectin domain family 4, member a1 (Dendritic cell inhibitory receptor 4)
SHHYV                  Q8BKW4    Zinc finger CCHC domain-containing protein 4
SISHV                  P14404    Ecotropic virus integration site 1 protein (EVI-1)
SRLLV                  Q9D5W4    Coiled-coil domain-containing protein 81
SRSLV                  Q3TES0    IQ motif and Sec7 domain-containing protein 3
TFLFV                  Q99MV4    Testis protein TEX20
TLIHV                  Q8VD98    Killer cell lectin-like receptor subfamily B member 1F (Natural killer cell surface protein NKR-P1F) (CD161f antigen)
VKVFV                  Q6V3W6    Coiled-coil domain-containing protein 135 (Spermatogenesis related gene in late stages of spermatogenesis cells protein) (SRG-L)

Table 3.9. Potential CIPP-PDZ3 Binding Proteins from Database Search. * Proteins that have been shown to interact with corresponding PDZ domains [143]

C-terminal Sequence  ID no.  Protein
AASSL  Q9CWY8  Ribonuclease H2 subunit A (EC 3.1.26.4) (RNase H2 subunit A)
AGSCL  Q921T2  Torsin-1A-interacting protein 1 (Lamina-associated polypeptide 1B)
AGTKL  P01674  Ig kappa chain V-III region PC 2154
AISQL  Q14DP8  Slitrk2 protein
ALSNM  P09041  Phosphoglycerate kinase 2 (EC 2.7.2.3) (Phosphoglycerate kinase, testis specific)
ALSNV  P09411  Phosphoglycerate kinase 1 (EC 2.7.2.3)
AVSLL  Q9WUH5  Tripartite motif-containing protein 10 (RING finger protein 9)
FASNL  O88704  Potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 1
FGSCI  Q8BRU4  C-type lectin domain family 9 member A
FLSFV  Q8K4E5  2'-5' oligoadenylate synthetase 2
FLSKM  Q8VFZ8  Olfactory receptor MOR114-6 (Olfactory receptor Olfr781)
FVSAM  Q8R2I0  Forkhead box protein E1 (Thyroid transcription factor 2) (TTF-2)
FYTTL  A2RSB9  Uroplakin 1A
GASLV  Q9ERT9  Protein phosphatase 1 regulatory subunit 1A (Protein phosphatase inhibitor 1) (IPP-1)
GASRM  A2ALC3  Likely ortologue to human doublesex-and mab-3-related transcription factor C1 (Dmrtc1)
GFSLI  Q6PAV2  Probable E3 ubiquitin-protein ligase HERC4 (EC 6.3.2.-)
GGSGL  P97815  Tumor metastasis associated gene product
GGSQL  A2AT09  Syntaphilin (Snph protein)
GGTSI  A6H6F2  Vomeronasal 2, receptor, 10
GLTAL  Q9CR40  Kelch-like protein 28 (BTB/POZ domain-containing protein 5)
GMSDI  Q9JKQ4  Iroquois-class homeodomain protein IRX-5 (Iroquois homeobox protein 5)
GPTLL  Q059P2  Pleckstrin homology domain containing, family H (With MyTH4 domain) member 2
GPTLV  Q9CY86  Protein G7d
GPTPV  Q18PH6  CD84-H1
GYSFM  Q810R5  Actg2 protein
IATII  A5GZX3  Glyoxalase I (EC 4.4.1.5)
IFSMI  Q32MT0  Mefv protein
IFTDV  P16390  Potassium voltage-gated channel subfamily A member 3
IPSIL  Q14B80  Kcnc2 protein
IPSSI  P06798  Homeobox protein Hox-A4 (Hox-1.4) (Homeobox protein MH-3)
IPTGI  Q3ZAW3  Neuropeptide Y receptor Y6
IYTKI  Q3UAP7  S-adenosylmethionine synthetase (EC 2.5.1.6)
LATKL  Q8CJD0  Anion transporter/exchanger-5 SLC26A6B
LFSDV  Q7TRG6  Olfactory receptor Olfr836
LFSPL  Q8K4R9  Disks large-associated protein DLG7 (Discs large homolog 7)
LISQV  Q8C729  Protein FAM126B
LITCI  Q6DIA2  SEC6-like protein C14orf73 homolog

Table 3.10. Potential CIPP-PDZ4 Binding Proteins from Database Search. *Proteins that have previously been shown to interact with the PDZ4 domain [143]

(continued)

Table 3.10. (continued)

C-terminal Sequence  ID no.  Protein
LITRV  A2RSL2  Microrchidia 4
LLSGI  Q4H437  Epsilon-Sarcoglycan splicing variant 3B
LLSGL  Q8VDK4  Cadherin 13
LLSKV  Q91V64  Isochorismatase domain-containing protein 1
LLTDV  P16388  Potassium voltage-gated channel subfamily A member 1
LLTGL  Q9DCV4  Protein FAM82B
LLTQM  Q8K2D7  3321401G04Rik protein
LMTAL  A2ALK7  Erythrocyte protein band 4.1-like 4b
LMTEL  A2ALK6  Erythrocyte protein band 4.1-like 4b
LPSIL  Q63959  Potassium voltage-gated channel subfamily C member 3
LPSNL  O70507  Potassium/sodium hyperpolarization-activated cyclic nucleotide-gated channel 4
LPSRL  P25688  Uricase (EC 1.7.3.3) (Urate oxidase)
LPTPL  A2AA61  DNA segment, Chr 11, ERATO Doi 175, expressed
LPTRL  O70293  G protein-coupled receptor kinase 6 (EC 2.7.11.16)
LVSAL  Q0VBR6  Kringle containing transmembrane protein 2
LVTRL  Q6X1Y6  Amiloride-sensitive cation channel 3 (Acid-sensitive ion channel 3) (ASIC3)*
LVTSL  Q0P6A0  Lphn2 protein
LYTGV  Q8C407  Protein YIPF4 (YIP1 family member 4)
MASRL  Q924J9  Six-transmembrane epithelial antigen of the prostate
MGTSL  A2A6D8  CDK5 regulatory subunit associated protein 3
MLTDV  P63141  Potassium voltage-gated channel subfamily A member 2
MLTEV  Q61923  Potassium voltage-gated channel subfamily A member 6
MMSTV  P42225  Signal transducer and activator of transcription 1
MVGSL  A2AL05  Ortholog of human G protein-coupled receptor 51 GPR51
MVTEV  Q17ST2  Voltage-dependent potassium channel Kv1.7
PASHV  Q06054  Zinc finger protein
PASIM  A2A823  Novel protein containing a Pyridoxal-dependent decarboxylase
PASKV  Q8CDG5  UPF0474 protein
PATVV  Q91UZ1  Phospholipase C beta 4 (Plcb4)
PFSHL  Q5PPQ9  Reps1 protein
PFSTL  Q05B39  Nuclear factor 1
PGSLV  Q148W6  Ligand of numb-protein X 2
PGTCI  A2AP43  Attractin
PGTLV  Q920B0  FERM domain-containing protein 4B (GRP1-binding protein GRSP1)
PGTYV  Q3TEL6  RING finger protein 157
PLGSL  Q8BW00  Probable peptidyl-tRNA hydrolase (EC 3.1.1.29) (PTH)
PLSHM  P48432  Transcription factor SOX-2
PLSQV  A2RTK9  Frizzled homolog 8 (Drosophila)
PLTHI  Q80XF2  Transcriptional factor SOX1
PLTQV  Q9QZB9  Dynactin subunit 5 (Dynactin subunit p25)

Table 3.10. (continued)

C-terminal Sequence  ID no.  Protein
PMTKL  Q0ZM18  Protocadherin-15-CD3 isoform 3
PPSLL  A2ASP9  Agrin
PPSRM  O88623  Ubiquitin carboxyl-terminal hydrolase 2 (EC 3.1.2.15) (Ubiquitin thioesterase 2)
PPSSI  O70561  Small proline-rich protein 2J
PPTKV  Q1MXF9  Sodium channel beta4 subunit
PVSVM  Q9Z2V9  Cyclin-I
PVTHL  P43267  SOX-15 protein
PVTVL  Q80XA6  RalBP1-associated Eps domain-containing protein 2 (RalBP1-interacting protein 2)
VFSIV  Q8C5M4  programmed cell death 6
VFTSM  Q640P5  Trpm1 protein
VGSPL  A2A667  Melanoma nuclear protein 13
VGTDL  Q8K2N9  Annexin A8
VGTEL  A2AC98  G protein-coupled receptor 142
VLSEL  Q6A070  Uncharacterized protein KIAA0423
VLSSV  Q50H70  Aquaporin 4 M23A splice variant ad/1 (Aquaporin 4 M23A splice variant ab/1)
VMTFL  O35215  D-dopachrome decarboxylase (EC 4.1.1.84) (D-dopachrome tautomerase)
VMTMV  Q7TN08  Dapper2
WLTAL  Q08EC1  WD repeat domain 75
YGTEL  Q5JCT0  Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase 3
YISSV  Q8CGR8  SPEER 2
YLSAV  Q05CS3  Cyp39a1 protein
YMTNV  A2A9X9  Novel protein
YPTCV  Q8R0I8  BC026782 protein
YVSHL  0VAZ2  progestin and adipoQ receptor family member III
YVSNL  Q0PCR6  Voltage-dependent calcium channel splice variant 1.2-a
YVSTM  Q8K1C9  Leucine-rich repeat-containing protein 41 (Protein Muf1)
YVTDI  Q4QQN1  Chd5 protein
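The database search behind Table 3.10 amounts to filtering protein C-termini against a consensus motif. The search criteria themselves are not stated in this section, so the pattern below is a hypothetical class I-like motif (Ser/Thr at position -2, a hydrophobic residue at the C-terminus) chosen only for illustration; the function names are likewise assumptions. A minimal sketch in Python:

```python
import re

# Hypothetical consensus, NOT taken from the text: Ser/Thr at position -2
# and a hydrophobic residue (L, I, V, or M) at the C-terminus (position 0).
CLASS_I_LIKE = re.compile(r"[ST].[LIVM]$")


def matches_motif(c_terminal_seq, pattern=CLASS_I_LIKE):
    """Return True if the C-terminal sequence ends with the given motif."""
    return pattern.search(c_terminal_seq.upper()) is not None


def filter_hits(entries, pattern=CLASS_I_LIKE):
    """Keep (sequence, accession) pairs whose sequence matches the motif."""
    return [(seq, acc) for seq, acc in entries if matches_motif(seq, pattern)]


# A few rows copied from Table 3.10.
rows = [
    ("LVTRL", "Q6X1Y6"),
    ("IFTDV", "P16390"),
    ("MVGSL", "A2AL05"),
    ("PLGSL", "Q8BW00"),
]
hits = filter_hits(rows)
```

With this illustrative pattern, entries such as MVGSL and PLGSL (Ser at position -1 rather than -2) are rejected, which shows that a single regular expression would not reproduce the full table; the actual search presumably used broader or multiple criteria.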

3.3 Discussion

We have now developed a general methodology to synthesize and screen OBOC combinatorial peptide libraries containing free C-termini and demonstrated its validity by determining the recognition motifs of four different PDZ domains. The new method was built upon three previously developed concepts/techniques: peptide inversion to generate support-bound peptides with free C-termini, spatial segregation of microbeads to separate library molecules from encoding tags, and high-throughput peptide sequencing by PED/MS. Compared to the biological methods, our method has the advantage of being able to accommodate posttranslationally modified amino acids and unnatural building blocks. This feature will be very useful for screening other proteins that require modified amino acids for binding or for designing metabolically stable inhibitors against PDZ domains. Unlike other chemical methods such as SPOT synthesis [116, 117], our method has no theoretical limit on the library size and is therefore able to determine the specificity profile of a protein/enzyme in a systematic manner. It provides individual binding sequences, from which a consensus sequence(s) is readily derived. Furthermore, our method does not require any special equipment, is straightforward to carry out, and is relatively inexpensive.

Our method should be readily applicable to any protein/enzyme that recognizes the C-terminus of a peptide or protein. In addition to PDZ domains, 14-3-3 proteins have been shown to recognize the C-termini of their partner proteins [97]. Unlike the PDZ domains, however, binding by the 14-3-3 proteins strictly requires phosphorylation of Ser or Thr at the C-termini [98]. To our knowledge, little is currently known about the specificity of these proteins (in terms of C-terminal binding) and no library method has been developed for this class of proteins. There are also a large number of enzymes that catalyze the modification of protein C-termini, including C-terminal specific proteases [83, 84] and C-terminal lipidation enzymes [85, 86]. Application of our library method to these challenging targets is already underway in this laboratory.

3.4 Conclusion

In conclusion, we have developed an effective method for synthesizing and sequencing free C-terminal peptide libraries. We have demonstrated our method in the search for the sequence specificities of four PDZ domains. This method should be applicable to determining the sequence specificities of other PDZ domains and of other C-terminal specific proteins, including those whose binding requires posttranslational modifications. Along with the cyclic peptide libraries introduced in Chapter 2, this method allows a different mode of peptide display on the resin support, which should further expand the utility of resin-bound peptide libraries.

3.5 Experimental Sections

3.5.1 Materials

Fmoc-protected L-amino acids and other reagents for peptide synthesis were purchased from Advanced ChemTech (Louisville, KY), Peptides International (Louisville, KY), or NovaBiochem (La Jolla, CA). Phenyl isothiocyanate (PITC) was purchased in 1-mL sealed ampoules from Sigma (St. Louis, MO), and a freshly opened ampoule was used in each experiment. Streptavidin-alkaline phosphatase (SA-AP) conjugate (~1 mg/mL) was purchased from Prozyme (San Leandro, CA). 5-Bromo-4-chloro-3-indolyl phosphate (BCIP) was from Sigma. Nα-Fmoc-Glu(δ-N-hydroxysuccinimidyl)-O-CH2CH=CH2 was prepared as previously described [138]. DNA plasmids containing the NHERF1 PDZ1 domain (pGEX-NHERF1-PDZ1) and CIPP protein (pcDNA-CIPP) were generous gifts from Dr. Mike Zhu of The Ohio State University. All oligonucleotides were custom synthesized at Integrated DNA Technologies (Coralville, IA).

3.5.2 Synthesis of Inverted Peptide Libraries

The inverted peptide library was synthesized on 2.0 g of TentaGel S NH2 resin (90 μm, 0.29 mmol/g). All of the manipulations were performed at room temperature unless otherwise noted. The linker sequence (BBLLM) was synthesized with 2 equiv of Fmoc-amino acids, using HBTU/HOBt/N-methylmorpholine (NMM) as the coupling reagents. The coupling reaction was typically allowed to proceed for 1 h and the beads were washed with DMF (3x) and DCM (3x). The Fmoc group was removed by treatment twice with 20% piperidine in DMF (5 + 15 min) and the beads were exhaustively washed with DCM, DMF, and again with DCM (3x each). To spatially segregate the beads into outer and inner layers, the resin was treated with 20% piperidine in DMF (5 + 15 min), washed with DMF and water, and soaked in water overnight. The resin was drained and suspended in a solution of Nα-Fmoc-Glu(δ-N-hydroxysuccinimidyl)-O-CH2CH=CH2 (0.19 mmol, 0.33 equiv) in 30 mL of 55:45 (v/v) DCM/diethyl ether. The mixture was vigorously shaken briefly and incubated on a rotary shaker for 30 min at room temperature. The beads were washed with 55:45 DCM/diethyl ether (3x) and DMF (8x) to remove water from the beads and then treated with 2 equiv of Boc-Gly-OH plus HBTU/HOBt/NMM in DMF (45 min). Next, the HMBA linker (2 equiv) was added to the surface layer using standard Fmoc/HBTU chemistry. The Boc group on the encoding peptide was then removed by treatment with TFA (1 h), and an Arg was added to the N-terminus by treatment with Fmoc-Arg(Pbf)-OH/HBTU/HOBt/NMM.

To synthesize the random region, the resin was split into 20 equal portions and each portion (100 mg) was coupled twice with 4 equiv of a different Fmoc-amino acid. The first random residue was coupled with diisopropylcarbodiimide/4,4-diaminopyridine in DCM for 6 h, while the other positions were coupled with HBTU/HOBt/NMM for 2 h. To differentiate isobaric amino acids during MS sequencing, 5% (mol/mol) CD3CO2D was added to the coupling reactions of Lys and Leu, and 5% CH3CD2CO2D was added to the reaction of Nle [7]. After the five random positions, a dipeptide Arg-Ala (or Ala-Ala) was added to the N-terminus of all peptides. Next, the allyl group on the C-terminal glutamate was removed by overnight treatment with a solution containing tetrakis(triphenylphosphine)palladium (1 equiv), triphenylphosphine (3 equiv), formic acid (10 equiv), and diethylamine (10 equiv) in anhydrous THF. The beads were washed sequentially with 0.5% diisopropylethylamine in DMF, 0.5% sodium dimethyldithiocarbamate hydrate in DMF, DMF (3x), DCM (3x), and DMF (3x). The N-terminal Fmoc group was then removed in 20% piperidine and the beads were washed with DMF (6x), 1 M HOBt in DMF (3x), DMF (3x), and DCM (3x). For peptide cyclization, a solution of PyBOP/HOBt/NMM (5, 5, 10 equiv, respectively) in DMF was mixed with the resin and the mixture was incubated on a rotary shaker for 3 h. The resin was washed with DMF (3x) and DCM (3x) and dried under vacuum for >1 h. The resin was treated with 1 M NaOH for 1 h, followed by side-chain deprotection with a modified reagent K (6.5% phenol, 5% water, 5% thioanisole, 2.5% ethanedithiol, 1% anisole, and 1% triisopropylsilane in TFA) for 1.5 h. The resin was washed with TFA and DCM, and dried under vacuum before storage at -20 °C.
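The reagent quantities and library size quoted in this procedure follow from simple arithmetic on the resin mass, the loading, and the split-and-pool scheme. The sketch below reproduces those numbers in Python; the function names are illustrative assumptions, and the diversity figure is the theoretical maximum for 20 building blocks at five random positions, not a measured value.

```python
def mmol_required(resin_g, loading_mmol_per_g, equiv):
    """Millimoles of reagent needed for a given number of equivalents."""
    return resin_g * loading_mmol_per_g * equiv


def library_diversity(n_building_blocks, n_positions):
    """Theoretical number of distinct sequences from split-and-pool synthesis."""
    return n_building_blocks ** n_positions


resin_g = 2.0          # TentaGel S NH2 resin used for the library
loading = 0.29         # mmol/g, as stated for the resin

total_sites = mmol_required(resin_g, loading, 1.0)       # all amine sites
surface_reagent = mmol_required(resin_g, loading, 0.33)  # 0.33 equiv for the outer layer
portion_mg = resin_g * 1000 / 20                         # resin per split portion
diversity = library_diversity(20, 5)                     # five random positions

print(f"total amine sites: {total_sites:.2f} mmol")
print(f"surface-layer reagent: {surface_reagent:.2f} mmol")
print(f"resin per portion: {portion_mg:.0f} mg")
print(f"theoretical diversity: {diversity:,} sequences")
```

The 0.33 equiv of activated glutamate corresponds to about 0.19 mmol, and each of the 20 portions contains 100 mg of resin, consistent with the values given in the procedure.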

3.5.3 Peptide Sequencing by PED/MS

Selected resin beads (anywhere from 1 to ~100 beads) were pooled and subjected to partial Edman degradation in a single reaction vessel as described previously [5-7]. The beads were suspended in 160 μL of pyridine/water (2:1, v/v) containing 0.1% triethylamine and mixed with an equal volume of pyridine containing Fmoc-OSu (2.5 μmol) and PITC (100 μmol). The reaction was allowed to proceed for 6 min with mixing, and the beads were washed with DCM and TFA. The beads were treated twice with ~300 μL of TFA for 6 min each. After the resin was washed with DCM and pyridine, the cycle was repeated 6 times. After the final cycle, the Fmoc group was removed by treatment with 20% piperidine in DMF twice (5 + 15 min). For MALDI-TOF analysis, the degraded beads were treated with ~1 mL of TFA containing ammonium iodide (10 mg) and dimethylsulfide (20 μL) on ice for 30 min to reduce any oxidized Met. After washing with water, the beads were transferred into microcentrifuge tubes (1 bead/tube) and each treated with 20 μL of 70% TFA containing CNBr (40 mg/mL) overnight in the dark. The solvents were evaporated under vacuum to dryness and the peptides released from the bead were dissolved in 5 μL of 0.1% TFA in water. One μL of the peptide solution was mixed with 2 μL of saturated 4-hydroxy-α-cyanocinnamic acid in acetonitrile/0.1% TFA (1:1) and 1 μL of the mixture was spotted onto a MALDI sample plate. Mass spectrometry was performed at the Campus Chemical Instrument Center of The Ohio State University on a Bruker III MALDI-TOF instrument in an automated manner. The data obtained were analyzed by either Moverz software (Proteometrics LLC, Winnipeg, Canada) or Bruker Daltonics flexAnalysis 2.4 (Bruker Daltonic GmbH, Germany).
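The sequence is read from the PED/MS spectrum as the mass differences between adjacent peaks of the truncation ladder. The following Python sketch illustrates that read-out under simplifying assumptions: it is not the analysis software named above, the residue table covers only a subset of amino acids, the 500.0 Da C-terminal offset is an arbitrary placeholder for the linker fragment, and the helper names are hypothetical.

```python
# Monoisotopic residue masses (Da) for a subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "I": 113.08406, "Q": 128.05858,
    "K": 128.09496, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333,
}


def decode_ladder(peak_masses, tol=0.05):
    """Read residues N- to C-terminus from differences between ladder peaks.

    Every residue whose mass lies within `tol` of a difference is reported,
    so isobaric or near-isobaric residues appear together (e.g. "I/L").
    """
    peaks = sorted(peak_masses, reverse=True)
    calls = []
    for heavier, lighter in zip(peaks, peaks[1:]):
        delta = heavier - lighter
        candidates = sorted(
            aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol
        )
        calls.append("/".join(candidates) if candidates else "?")
    return calls


def simulate_ladder(sequence, c_terminal_offset=500.0):
    """Masses of the full-length peptide and each N-terminal truncation.

    `c_terminal_offset` is an arbitrary stand-in for the constant fragment.
    """
    return [
        c_terminal_offset + sum(RESIDUE_MASS[aa] for aa in sequence[i:])
        for i in range(len(sequence) + 1)
    ]


ladder = simulate_ladder("AVKLF")
calls = decode_ladder(ladder)
print(calls)
```

At this tolerance Lys/Gln and Leu/Ile cannot be told apart from the mass differences alone, which is why the synthesis adds deuterated capping agents to the coupling reactions of selected residues.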

3.5.4 Recombinant DNA Constructs

The DNA sequence encoding each of the four CIPP PDZ domains was amplified from plasmid pcDNA-CIPP by the polymerase chain reaction (PCR). The PCR primers for subcloning were 5’-C GAA TTC CAT ATG GGA GAA CTA CAC ATT ATC GAA CTG G-3’ and 5’-G GGA TCC GTC GAC CTG ACT GAC TGC ATC TTC GTT TC-3’ for the PDZ1 domain; 5’-C GAA TTC CAT ATG GGC CAG GAA ATG ATC ATA GAA ATA TCC-3’ and 5’-G GGA TCC GTC GAC CTG TGC TTC ATC TCT GTA TAC CAC C-3’ for the PDZ2 domain; 5’-C GAA TTC CAT ATG GAC GAG GAG AAC TTG GAG GTG-3’ and 5’-C ATC GTG GTC GAC GGA ACC GGC GCG GAG TCT-3’ for the PDZ3 domain; and 5’-C GAA TTC CAT ATG ACA GAG GAG GAA CCA AGG ACT-3’ and 5’-C ATC GTG GTC GAC CTG GGT CGC TAT GGC ACT TAT-3’ for the PDZ4 domain. The PCR products were digested with restriction enzymes NdeI and SalI and ligated into an engineered pET-22b(+) vector that contains a ybbR tag for specific labeling with phosphopantetheinyl transferase Sfp [144, 145]. All DNA constructs were confirmed by dideoxy sequencing at Genewiz (South Plainfield, NJ) or the Plant-Microbe Genomics Facility of The Ohio State University.
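Because the PCR products are cut with NdeI and SalI, each forward primer should carry an NdeI site (CATATG) and each reverse primer a SalI site (GTCGAC). A short Python check of the primers listed above is sketched here; the dictionary layout and function names are illustrative assumptions, while the primer sequences and recognition sites are as stated.

```python
NDEI = "CATATG"  # NdeI recognition sequence
SALI = "GTCGAC"  # SalI recognition sequence

# (forward, reverse) primers, written 5' to 3' as in the text.
PRIMERS = {
    "PDZ1": ("C GAA TTC CAT ATG GGA GAA CTA CAC ATT ATC GAA CTG G",
             "G GGA TCC GTC GAC CTG ACT GAC TGC ATC TTC GTT TC"),
    "PDZ2": ("C GAA TTC CAT ATG GGC CAG GAA ATG ATC ATA GAA ATA TCC",
             "G GGA TCC GTC GAC CTG TGC TTC ATC TCT GTA TAC CAC C"),
    "PDZ3": ("C GAA TTC CAT ATG GAC GAG GAG AAC TTG GAG GTG",
             "C ATC GTG GTC GAC GGA ACC GGC GCG GAG TCT"),
    "PDZ4": ("C GAA TTC CAT ATG ACA GAG GAG GAA CCA AGG ACT",
             "C ATC GTG GTC GAC CTG GGT CGC TAT GGC ACT TAT"),
}


def normalize(primer):
    """Remove the codon spacing and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def site_report(primers):
    """Map each domain to (forward has NdeI site, reverse has SalI site)."""
    report = {}
    for domain, (forward, reverse) in primers.items():
        report[domain] = (NDEI in normalize(forward), SALI in normalize(reverse))
    return report


report = site_report(PRIMERS)
for domain, (has_ndei, has_sali) in report.items():
    print(domain, "NdeI in forward:", has_ndei, "| SalI in reverse:", has_sali)
```

The same helper could be pointed at other recognition sequences to confirm that an insert does not contain an internal site for either enzyme, which this sketch does not check.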

3.5.5 Purification and Biotinylation of PDZ Domains

E. coli BL21(DE3) cells harboring the proper expression vector were grown in LB medium to the mid-log phase and induced by the addition of 100-150 μM isopropyl β-D-thiogalactoside for 6 h at 30 °C. The cells were harvested by centrifugation and lysed by sonication in a lysis buffer (50 mM sodium phosphate, pH 7.4, 5 mM imidazole, 300 mM NaCl for the CIPP PDZ domains; 100 mM Tris, pH 8.0, 150 mM NaCl, 1 mM EDTA, 1 mM DTT for GST-PDZ1) in the presence of protease inhibitors (3.5 mg of PMSF, 2 mg of trypsin inhibitor, and 0.1 mg of pepstatin A per 100 mL). The GST-PDZ1 fusion protein (for NHERF1) was purified from the crude cell lysate on a glutathione-agarose column, and the eluted protein was dialyzed against phosphate-buffered saline (pH 7.4). To label the protein with biotin, 2 mg of protein (~0.7 mL) was incubated with 2 equiv of (+)-biotin N-hydroxysuccinimide ester (Sigma) in 0.1 M NaHCO3 (pH 8.3−8.5) for 30 min. The unreacted NHS ester was quenched by the addition of 50 μL of 1 M Tris buffer (pH 8.5). The CIPP PDZ domains, which contained C-terminal ybbR and six-histidine tags, were purified by metal-affinity chromatography and enzymatically labeled with biotin by using phosphopantetheinyl transferase Sfp and a biotin-CoA adduct [144, 145]. The labeled proteins were passed through a G-25 column (eluted with phosphate-buffered saline) to remove any free biotin. Protein concentration was determined by the Bradford method or UV absorbance at 280 nm.

3.5.6 Peptide Library Screening

In a Micro Bio-Spin column (0.8 mL, Bio-Rad), 50 mg of the peptide library was swollen in DCM and extensively washed with DMF and then water. The resin was incubated in HBST-gelatin buffer (30 mM HEPES, pH 7.4, 150 mM NaCl, 0.05% Tween 20, and 0.1% gelatin) for 2 h to overnight with gentle mixing at 4 °C. The resin was drained, resuspended in 800 μL of the same buffer containing 0.5-1 μM biotinylated PDZ domain protein, and incubated overnight at 4 °C. The resin was drained, resuspended in SA-AP buffer (30 mM Tris-HCl, pH 7.4, 250 mM NaCl, 10 mM MgCl2, and 70 μM ZnCl2) containing SA-AP (1 μg/mL), and incubated for 15 min at room temperature with gentle mixing on a roller wheel. The resin was washed with 800 μL of SA-AP buffer (2x), 800 μL of HBST-gelatin buffer (3x), and 800 μL of SA-AP reaction buffer (30 mM Tris-HCl, pH 8.5, 100 mM NaCl, 5 mM MgCl2, 20 μM ZnCl2) (3x). The resin was transferred into a single well of a 12-well plate (BD Falcon) by rinsing with 3 x 300 μL of the SA-AP reaction buffer. Upon the addition of 100 μL of 5 mg/mL BCIP in the SA-AP reaction buffer, intense turquoise color developed on positive beads in ~30 min, at which point the staining was quenched by the addition of 100 μL of 1 M HCl. The resin was washed extensively with water and transferred into a 35-mm Petri dish, from which the positive beads were picked manually with a pipette under a dissecting microscope. The screening procedure was repeated once with 50 mg of the peptide library.

3.5.7 Synthesis of Individual PDZ-Binding Peptides

Each peptide was synthesized on 100 mg of Wang resin (0.8 mmol/g) with standard Fmoc chemistry. A tripeptide (Ac-YAA or Ac-YSS) was added to the N-terminus of each peptide to facilitate concentration determination. After cleavage and deprotection, the identity of the peptides was confirmed by MALDI-TOF mass spectrometry. Analytical HPLC (C18 column) indicated that each peptide had >80% purity except peptide VFHRV-OH (70%). The peptide concentration was determined by measuring UV absorbance at 280 nm.
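The N-terminal Tyr in the added tripeptide provides the chromophore for this measurement, and the concentration follows from the Beer-Lambert law, c = A280/(ε × l). The Python sketch below shows the calculation; the extinction coefficients are commonly used approximate values for Tyr and Trp side chains and are an assumption here (the values actually used are not stated), and the absorbance reading is a hypothetical number.

```python
# Approximate molar extinction coefficients at 280 nm (M^-1 cm^-1).
# Assumed literature-style values; the coefficients used in the text are not given.
EXT_280 = {"W": 5500.0, "Y": 1490.0}


def extinction_280(sequence):
    """Estimate the 280 nm extinction coefficient from Trp and Tyr content."""
    return sum(EXT_280.get(aa, 0.0) for aa in sequence.upper())


def concentration_molar(a280, sequence, path_cm=1.0):
    """Beer-Lambert law: concentration = A280 / (epsilon * path length)."""
    epsilon = extinction_280(sequence)
    if epsilon == 0:
        raise ValueError("sequence has no Trp or Tyr; A280 cannot be used")
    return a280 / (epsilon * path_cm)


# Ac-YAA tag followed by the VFHRV sequence; 0.149 is a hypothetical reading.
conc = concentration_molar(0.149, "YAAVFHRV")
print(f"{conc * 1e6:.0f} uM")
```

Peptides lacking Tyr and Trp have essentially no absorbance at 280 nm, which is the reason for appending the Tyr-containing tripeptide.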

3.5.8 Determination of Dissociation Constants by SPR

The binding affinity of the individual peptides to CIPP PDZ domains was determined by surface plasmon resonance analysis on a BIAcore 3000. Assays were performed in HBS-EP buffer (10 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.005% polysorbate 20) at 25 °C. Biotinylated PDZ domains were immobilized on a streptavidin-coated sensor chip SA. The chip was first conditioned with an activation buffer containing 1 M NaCl and 50 mM NaOH following the manufacturer’s instructions. Injection of the biotinylated PDZ protein (~20 μg/mL) was continued until there was no further increase in the signal (~3500-6500 RU increase over background). For KD measurements, varying concentrations of the peptides (0.5−1000 μM) dissolved in HBS-EP buffer were passed over the chip for 2 min at a flow rate of 5 μL/min. Flow cell 1 on the sensor chip was not loaded with any PDZ protein and was used as the blank. In between runs, complete regeneration of the surface was achieved without the use of strip buffer in most cases; when necessary, the sensor chip was regenerated by injecting a strip buffer (0.001% SDS in HBS-EP buffer) for 5−10 s at a flow rate of 50−100 μL/min. The equilibrium response unit (RUeq) at a given peptide concentration was obtained by subtracting the response of the blank flow cell from the response of the flow cell containing the PDZ domain. The dissociation constant (KD) was obtained by nonlinear regression fitting of the data to the equation RUeq = RUmax × [peptide]/(KD + [peptide]), where RUmax is the maximum response unit.
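The fit to RUeq = RUmax × [peptide]/(KD + [peptide]) can be illustrated with a small least-squares routine. The Python sketch below is not the fitting software used for this work (which is not named); it is a minimal standard-library example that exploits the fact that RUmax has a closed-form optimum for any fixed KD, scans KD on a logarithmic grid, and refines the best grid point by golden-section search. The data are synthetic and noise-free, and all function names are assumptions.

```python
import math


def binding_response(conc, ru_max, kd):
    """One-site isotherm: RUeq = RUmax * [peptide] / (KD + [peptide])."""
    return ru_max * conc / (kd + conc)


def best_ru_max(concs, responses, kd):
    """For a fixed KD the model is linear in RUmax, so solve it in closed form."""
    frac = [c / (kd + c) for c in concs]
    return sum(f * r for f, r in zip(frac, responses)) / sum(f * f for f in frac)


def sse(concs, responses, kd):
    """Sum of squared residuals at a given KD with the matching optimal RUmax."""
    ru_max = best_ru_max(concs, responses, kd)
    return sum(
        (r - binding_response(c, ru_max, kd)) ** 2 for c, r in zip(concs, responses)
    )


def fit_kd(concs, responses, log_low=-3.0, log_high=5.0, step=0.1, iterations=80):
    """Coarse log-grid scan, then golden-section refinement of log10(KD)."""
    n = int(round((log_high - log_low) / step))
    grid = [log_low + i * step for i in range(n + 1)]
    best = min(range(len(grid)), key=lambda i: sse(concs, responses, 10 ** grid[i]))
    lo = grid[max(best - 1, 0)]
    hi = grid[min(best + 1, len(grid) - 1)]
    phi = (math.sqrt(5) - 1) / 2
    x1, x2 = hi - phi * (hi - lo), lo + phi * (hi - lo)
    f1, f2 = sse(concs, responses, 10 ** x1), sse(concs, responses, 10 ** x2)
    for _ in range(iterations):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - phi * (hi - lo)
            f1 = sse(concs, responses, 10 ** x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + phi * (hi - lo)
            f2 = sse(concs, responses, 10 ** x2)
    kd = 10 ** ((lo + hi) / 2)
    return kd, best_ru_max(concs, responses, kd)


# Synthetic, noise-free data (hypothetical values, not measurements).
concs_uM = [0.5, 2, 8, 32, 128, 512, 1000]
true_kd, true_ru_max = 25.0, 120.0
responses = [binding_response(c, true_ru_max, true_kd) for c in concs_uM]
kd_fit, ru_max_fit = fit_kd(concs_uM, responses)
print(f"KD = {kd_fit:.2f} uM, RUmax = {ru_max_fit:.1f} RU")
```

With real sensorgram data the blank-subtracted RUeq values would replace the synthetic responses, and a dedicated fitting package would additionally report parameter uncertainties, which this sketch omits.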


BIBLIOGRAPHY

1. Lam, K. S., Salmon, S. E., Hersh, E. M., Hruby, V. J., Kazmierski, W. M., and Knapp, R. J. (1991) A new type of synthetic peptide library for identifying ligand-binding activity. Nature 354, 82-84

2. Furka, A., Sebestyen, F., Asgedom, M., and Dibo, G. (1991) General method for rapid synthesis of multicomponent peptide mixtures. Int J Pept Protein Res 37, 487-493

3. Houghten, R. A., Pinilla, C., Blondelle, S. E., Appel, J. R., Dooley, C. T., and Cuervo, J. H. (1991) Generation and use of synthetic peptide combinatorial libraries for basic research and drug discovery. Nature 354, 84-86

4. Sweeney, M. C., Wavreille, A. S., Park, J., Butchar, J. P., Tridandapani, S., and Pei, D. (2005) Decoding protein-protein interactions through combinatorial chemistry: sequence specificity of SHP-1, SHP-2, and SHIP SH2 domains. Biochemistry 44, 14932-14947

5. Wang, P., Arabaci, G., and Pei, D. (2001) Rapid sequencing of library-derived peptides by partial edman degradation and mass spectrometry. J Comb Chem 3, 251-254

6. Sweeney, M. C., and Pei, D. (2003) An improved method for rapid sequencing of support-bound peptides by partial edman degradation and mass spectrometry. J Comb Chem 5, 218-222

7. Thakkar, A., Wavreille, A. S., and Pei, D. (2006) Traceless capping agent for peptide sequencing by partial edman degradation and mass spectrometry. Anal Chem 78, 5935-5939

8. Garaud, M., and Pei, D. (2007) Substrate profiling of protein tyrosine phosphatase PTP1B by screening a combinatorial peptide library. J Am Chem Soc 129, 5366-5367

9. Hotchkiss, R. D. (1941) The Chemical Nature of Gramicidin and Tyrocidine. J Biol Chem 141, 171-185

10. Synge, R. L. M. (1945) 'Gramicidin S': Over-all Chemical Characteristics and Amino-Acid Composition. Biochem J 39, 363-367

11. Edman, P. (1959) Chemistry of amino acids and peptides. Annu Rev Biochem 28, 69-96

12. Horton, D. A., Bourne, G. T., and Smythe, M. L. (2002) Exploring privileged structures: the combinatorial synthesis of cyclic peptides. J Comput Aided Mol Des 16, 415-430

13. Rizo, J., and Gierasch, L. M. (1992) Constrained peptides: models of bioactive peptides and protein substructures. Annu Rev Biochem 61, 387-418

14. Dubos, R. J., and Cattaneo, C. (1939) Studies on a bactericidal agent extracted from a soil bacillus: III. Preparation and activity of a protein-free fraction. J Exp Med 70, 249-256

15. Hotchkiss, R. D., and Dubos, R. J. (1940) Fractionation of the bactericidal agent from cultures of a soil bacillus. J Biol Chem 132, 791-792

16. Hotchkiss, R. D., and Dubos, R. J. (1941) The isolation of bactericidal substances from cultures of Bacillus brevis. J Biol Chem 141, 155-162

17. Hodgkin, D. C., and Oughton, B. M. (1957) Possible molecular models for gramicidin S and their relationship to present ideas of protein structure. Biochem J 65, 752-756

18. Hull, S. E., Karlsson, R., Main, P., Woolfson, M. M., and Dodson, E. J. (1978) The Crystal Structure of a Hydrated Gramicidin S-Urea Complex. Nature 275, 206-207

19. Kuo, M. C., and Gibbons, W. A. (1980) Nuclear Overhauser effect and cross-relaxation rate determinations of dihedral and transannular interproton distances in the decapeptide tyrocidine A. Biophys J 32, 807-836

20. Zasloff, M. (2002) Antimicrobial peptides of multicellular organisms. Nature 415, 389-395

21. Prenner, E. J., Lewis, R. N., and McElhaney, R. N. (1999) The interaction of the antimicrobial peptide gramicidin S with lipid bilayer model and biological membranes. Biochim Biophys Acta 1462, 201-221

22. Straus, S. K., and Hancock, R. E. (2006) Mode of action of the new antibiotic for Gram-positive pathogens daptomycin: comparison with cationic antimicrobial peptides and lipopeptides. Biochim Biophys Acta 1758, 1215-1223

23. Matsuzaki, K. (1999) Why and how are peptide-lipid interactions utilized for self-defense? Magainins and tachyplesins as archetypes. Biochim Biophys Acta 1462, 1-10

24. Hancock, R. E., and Lehrer, R. (1998) Cationic peptides: a new source of antibiotics. Trends Biotechnol 16, 82-88

25. Yonezawa, H., Okamoto, K., Kaneda, M., Tominaga, N., and Izumiya, N. (1983) Studies of peptide antibiotics. XLIV. Syntheses and biological activities of gramicidin S analogs containing delta-hydroxy-L-norvaline or glycine. Int J Pept Protein Res 22, 573-581

26. Kohli, R. M., Walsh, C. T., and Burkart, M. D. (2002) Biomimetic synthesis and optimization of cyclic peptide antibiotics. Nature 418, 658-661

27. Qin, C., Bu, X., Zhong, X., Ng, N. L., and Guo, Z. (2004) Optimization of antibacterial cyclic decapeptides. J Comb Chem 6, 398-406

28. Qin, C., Zhong, X., Bu, X., Ng, N. L., and Guo, Z. (2003) Dissociation of antibacterial and hemolytic activities of an amphipathic peptide antibiotic. J Med Chem 46, 4830-4833

29. Bu, X., Wu, X., Xie, G., and Guo, Z. (2002) Synthesis of tyrocidine A and its analogues by spontaneous cyclization in aqueous solution. Org Lett 4, 2893-2895

30. Giroux, R. (2005) Cyclosporine. Chem. Eng. News 83

31. Wenger, R. M. (1984) Synthesis of cyclosporine. Total syntheses of cyclosporin A and cyclosporin H, two fungal metabolites isolated from the species Tolypocladium inflatum Gams. Helv Chim Acta 67, 502-525

32. Aramburu, J., Yaffe, M. B., Lopez-Rodriguez, C., Cantley, L. C., Hogan, P. G., and Rao, A. (1999) Affinity-driven peptide selection of an NFAT inhibitor more selective than cyclosporin A. Science 285, 2129-2133

33. Gehlsen, K. R., Argraves, W. S., Pierschbacher, M. D., and Ruoslahti, E. (1988) Inhibition of in vitro tumor cell invasion by Arg-Gly-Asp-containing synthetic peptides. J Cell Biol 106, 925-930

34. Humphries, M. J., Olden, K., and Yamada, K. M. (1986) A synthetic peptide from fibronectin inhibits experimental metastasis of murine melanoma cells. Science 233, 467-470

35. Gurrath, M., Muller, G., Kessler, H., Aumailley, M., and Timpl, R. (1992) Conformation/activity studies of rationally designed potent anti-adhesive RGD peptides. Eur J Biochem 210, 911-921

36. Noiri, E., Gailit, J., Sheth, D., Magazine, H., Gurrath, M., Muller, G., Kessler, H., and Goligorsky, M. S. (1994) Cyclic RGD peptides ameliorate ischemic acute renal failure in rats. Kidney Int 46, 1050-1058

37. Koivunen, E., Gay, D. A., and Ruoslahti, E. (1993) Selection of peptides binding to the alpha 5 beta 1 integrin from phage display library. J Biol Chem 268, 20205-20210

38. Brooks, P. C., Montgomery, A. M., Rosenfeld, M., Reisfeld, R. A., Hu, T., Klier, G., and Cheresh, D. A. (1994) Integrin alpha v beta 3 antagonists promote tumor regression by inducing apoptosis of angiogenic blood vessels. Cell 79, 1157-1164

39. Dechantsreiter, M. A., Planker, E., Matha, B., Lohof, E., Holzemann, G., Jonczyk, A., Goodman, S. L., and Kessler, H. (1999) N-Methylated cyclic RGD peptides as highly active and selective alpha(V)beta(3) integrin antagonists. J Med Chem 42, 3033-3040

40. Hariharan, S., Gustafson, D., Holden, S., McConkey, D., Davis, D., Morrow, M., Basche, M., Gore, L., Zang, C., O'Bryant, C. L., Baron, A., Gallemann, D., Colevas, D., and Eckhardt, S. G. (2007) Assessment of the biological and pharmacological effects of the alpha nu beta3 and alpha nu beta5 integrin receptor antagonist, cilengitide (EMD 121974), in patients with advanced solid tumors. Ann Oncol 18, 1400-1407

41. Athanassiou, Z., Dias, R. L., Moehle, K., Dobson, N., Varani, G., and Robinson, J. A. (2004) Structural mimicry of retroviral tat proteins by constrained beta-hairpin peptidomimetics: ligands with high affinity and selectivity for viral TAR RNA regulatory elements. J Am Chem Soc 126, 6906-6913

42. Athanassiou, Z., Patora, K., Dias, R. L., Moehle, K., Robinson, J. A., and Varani, G. (2007) Structure-guided peptidomimetic design leads to nanomolar beta-hairpin inhibitors of the Tat-TAR interaction of bovine immunodeficiency virus. Biochemistry 46, 741-751

43. Smith, G. P. (1985) Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science 228, 1315-1317

44. O'Neil, K. T., Hoess, R. H., Jackson, S. A., Ramachandran, N. S., Mousa, S. A., and DeGrado, W. F. (1992) Identification of novel peptide antagonists for GPIIb/IIIa from a conformationally constrained phage peptide library. Proteins 14, 509-515

45. Smith, G. P., and Petrenko, V. A. (1997) Phage Display. Chem Rev 97, 391-410

46. Perler, F. B. (2005) Protein splicing mechanisms and applications. IUBMB Life 57, 469-476

47. Perler, F. B., Davis, E. O., Dean, G. E., Gimble, F. S., Jack, W. E., Neff, N., Noren, C. J., Thorner, J., and Belfort, M. (1994) Protein splicing elements: inteins and exteins--a definition of terms and recommended nomenclature. Nucleic Acids Res 22, 1125-1127

48. Scott, C. P., Abel-Santos, E., Wall, M., Wahnon, D. C., and Benkovic, S. J. (1999) Production of cyclic peptides and proteins in vivo. Proc Natl Acad Sci U S A 96, 13638-13643

49. Horswill, A. R., Savinov, S. N., and Benkovic, S. J. (2004) A systematic method for identifying small-molecule modulators of protein-protein interactions. Proc Natl Acad Sci U S A 101, 15591-15596

50. Tavassoli, A., and Benkovic, S. J. (2005) Genetically selected cyclic-peptide inhibitors of AICAR transformylase homodimerization. Angew Chem Int Ed Engl 44, 2760-2763

51. Roberts, R. W., and Szostak, J. W. (1997) RNA-peptide fusions for the in vitro selection of peptides and proteins. Proc Natl Acad Sci U S A 94, 12297-12302

52. Millward, S. W., Takahashi, T. T., and Roberts, R. W. (2005) A general route for post-translational cyclization of mRNA display libraries. J Am Chem Soc 127, 14142-14143

53. Millward, S. W., Fiacco, S., Austin, R. J., and Roberts, R. W. (2007) Design of Cyclic Peptides That Bind Protein Surfaces with Antibody-Like Affinity. ACS Chem. Biol. 2, 625-634

54. Blackburn, C., and Kates, S. A. (1997) Solid-phase synthesis of cyclic homodetic peptides. Methods Enzymol 289, 175-198

55. Bloomberg, G. B., Askin, D., Gargaro, A. R., and Tanner, M. J. A. (1993) Synthesis of a branched cyclic peptide using a strategy employing Fmoc chemistry and two additional orthogonal protecting groups. Tetrahedron Lett. 34, 4709-4712

56. Zang, X., Yu, Z., and Chu, Y. H. (1998) Tight-binding streptavidin ligands from a cyclic peptide library. Bioorg Med Chem Lett 8, 2327-2332

57. Graves, P. R., and Haystead, T. A. (2002) Molecular biologist's guide to proteomics. Microbiol Mol Biol Rev 66, 39-63; table of contents

58. Siegel, M. M., Huang, J., Lin, B., Tsao, R., and Edmonds, C. G. (1994) Structures of bacitracin A and isolated congeners: sequencing of cyclic peptides with blocked linear side chains by electrospray ionization mass spectrometry. Biol Mass Spectrom 23, 186-204

59. Eckert, K., Schwarz, H., Tomer, K. B., and Gross, M. L. (1985) Tandem mass spectrometry methodology for the sequence determination of cyclic peptides. J Am Chem Soc 107, 6765-6769

60. Ngoka, L. C., and Gross, M. L. (1999) Multistep tandem mass spectrometry for sequencing cyclic peptides in an ion-trap mass spectrometer. J Am Soc Mass Spectrom 10, 732-746

61. Schilling, B., Wang, W., McMurray, J. S., and Medzihradszky, K. F. (1999) Fragmentation and sequencing of cyclic peptides by matrix-assisted laser desorption/ionization post-source decay mass spectrometry. Rapid Commun Mass Spectrom 13, 2174-2179

62. Redman, J. E., Wilcoxen, K. M., and Ghadiri, M. R. (2003) Automated mass spectrometric sequence determination of cyclic peptide library members. J Comb Chem 5, 33-40

63. Chen, J. K., Lane, W. S., Brauer, A. W., Tanaka, A., and Schreiber, S. L. (1993) Biased combinatorial libraries: novel ligands for the SH3 domain of phosphatidylinositol 3-kinase. J Am Chem Soc 115, 12591-12592

64. Songyang, Z., Shoelson, S. E., Chaudhuri, M., Gish, G., Pawson, T., Haser, W. G., King, F., Roberts, T., Ratnofsky, S., Lechleider, R. J., Neel, B. G., Birge, R. B., Fajardo, J. E., Chou, M. M., Hanafusa, H., Schaffhausen, B., and Cantley, L. C. (1993) SH2 Domains Recognize Specific Phosphopeptide Sequences. Cell 72, 767-778

65. Muller, K., Gombert, F. O., Manning, U., Grossmuller, F., Graff, P., Zaegel, H., Zuber, J. F., Freuler, F., Tschopp, C., and Baumann, G. (1996) Rapid identification of phosphopeptide ligands for SH2 domains. Screening of peptide libraries by fluorescence-activated bead sorting. J Biol Chem 271, 16500-16505

66. Dooley, C. T., Ny, P., Bidlack, J. M., and Houghten, R. A. (1998) Selective ligands for the mu, delta, and kappa opioid receptors identified from a single mixture based tetrapeptide positional scanning combinatorial library. J Biol Chem 273, 18848-18856

67. Smith, H. K., and Bradley, M. (1999) Comparison of resin and solution screening methodologies in combinatorial chemistry and the identification of a 100 nM inhibitor of trypanothione reductase. J Comb Chem 1, 326-332

68. Peng, L., Liu, R. W., Marik, J., Wang, X. B., Takada, Y., and Lam, K. S. (2006) Combinatorial chemistry identifies high-affinity peptidomimetics against alpha4 beta1 integrin for in vivo tumor imaging. Nature Chem. Biol 2, 381-389

69. Wu, J., Ma, Q. N., and Lam, K. S. (1994) Identifying substrate motifs of protein kinases by a random library approach. Biochemistry 33, 14825-14833

70. Matthews, D. J., and Wells, J. A. (1993) Substrate phage: selection of protease substrates by monovalent phage display. Science 260, 1113-1117

71. Meldal, M., Svendsen, I., Breddam, K., and Auzanneau, F.-I. (1994) Portion-Mixing Peptide Libraries of Quenched Fluorogenic Substrates for Complete Subsite Mapping of Endoprotease Specificity. Proc Natl Acad Sci U S A 91, 3314-3318

72. Peterson, J. J., and Meares, C. F. (1998) Cathepsin substrates as cleavable peptide linkers in bioconjugates, selected from a fluorescence quench combinatorial library. Bioconjug Chem 9, 618-626

73. Hu, Y. J., Wei, Y., Zhou, Y., Rajagopalan, P. T., and Pei, D. (1999) Determination of substrate specificity for peptide deformylase through the screening of a combinatorial peptide library. Biochemistry 38, 643-650

74. Harris, J. L., Backes, B. J., Leonetti, F., Mahrus, S., Ellman, J. A., and Craik, C. S. (2000) Rapid and general profiling of protease specificity by using combinatorial fluorogenic substrate libraries. Proc Natl Acad Sci U S A 97, 7754-7759

75. Francis, M. B., Jamison, T. F., and Jacobsen, E. N. (1998) Combinatorial libraries of transition-metal complexes, catalysts and materials. Curr Opin Chem Biol 2, 422-428

76. Miller, S. J., Copeland, G. T., Papaioannou, N., Horstmann, T. E., and Ruel, E. M. (1998) Kinetic Resolution of Alcohols Catalyzed by Tripeptides Containing the N-Alkylimidazole Substructure. J Am Chem Soc 120, 1629-1630

77. Copeland, G. T., Jarvo, E. R., and Miller, S. J. (1998) Minimal Acylase-Like Peptides. Conformational Control of Absolute Stereospecificity. J Org Chem 63, 6784-6785

78. Berkessel, A., and Herault, D. A. (1999) Discovery of peptide-zirconium complexes that mediate phosphate hydrolysis by batch screening of a combinatorial undecapeptide library. Angew Chem Int Ed 38, 102-105

79. Hoveyda, A. H. (1998) Catalyst discovery through combinatorial chemistry. Chem Biol 5, R187-R191

80. Copeland, G. T., and Miller, S. J. (2001) Selection of enantioselective acyl transfer catalysts from a pooled peptide library through a fluorescence-based activity assay: an approach to kinetic resolution of secondary alcohols of broad structural scope. J Am Chem Soc 123, 6496-6502

81. Thieriet, N., Guibe, F., and Albericio, F. (2000) Solid-phase peptide synthesis in the reverse (N --> C) direction. Org Lett 2, 1815-1817

82. Alsina, J., and Albericio, F. (2003) Solid-phase synthesis of C-terminal modified peptides. Biopolymers 71, 454-477

83. Keiler, K. C., Waller, P. R., and Sauer, R. T. (1996) Role of a peptide tagging system in degradation of proteins synthesized from damaged messenger RNA. Science 271, 990-993

84. Pallen, M. J., and Wren, B. W. (1997) The HtrA family of serine proteases. Mol Microbiol 26, 209-221

85. Zhang, F. L., and Casey, P. J. (1996) Protein prenylation: molecular mechanisms and functional consequences. Annu Rev Biochem 65, 241-269

86. Cook, T. A., Ghomashchi, F., Gelb, M. H., Florio, S. K., and Beavo, J. A. (2000) Binding of the delta subunit to rod phosphodiesterase catalytic subunits requires methylated, prenylated C-termini of the catalytic subunits. Biochemistry 39, 13516-13523

87. Anderson, D., Koch, C. A., Grey, L., Ellis, C., Moran, M. F., and Pawson, T. (1990) Binding of SH2 domains of phospholipase C gamma 1, GAP, and Src to activated growth factor receptors. Science 250, 979-982

88. Blaikie, P., Immanuel, D., Wu, J., Li, N., Yajnik, V., and Margolis, B. (1994) A region in Shc distinct from the SH2 domain can bind tyrosine-phosphorylated growth factor receptors. J Biol Chem 269, 32031-32034

89. Nourry, C., Grant, S. G., and Borg, J. P. (2003) PDZ domain proteins: plug and play! Sci STKE 2003, RE7

90. Hung, A. Y., and Sheng, M. (2002) PDZ domains: structural modules for protein complex assembly. J Biol Chem 277, 5699-5702

91. Morais Cabral, J. H., Petosa, C., Sutcliffe, M. J., Raza, S., Byron, O., Poy, F., Marfatia, S. M., Chishti, A. H., and Liddington, R. C. (1996) Crystal structure of a PDZ domain. Nature 382, 649-652

92. Doyle, D. A., Lee, A., Lewis, J., Kim, E., Sheng, M., and MacKinnon, R. (1996) Crystal structures of a complexed and peptide-free membrane protein-binding domain: molecular basis of peptide recognition by PDZ. Cell 85, 1067-1076

93. Moore, B. W., and Perez, V. J. (1967) Specific acidic proteins of the nervous system. In Physiological and Biochemical Aspects of Nervous Integration (Carlson, F. D., ed) pp. 343-359, Prentice-Hall, Englewood Cliffs, NJ

94. Celis, J. E., Gesser, B., Rasmussen, H. H., Madsen, P., Leffers, H., Dejgaard, K., Honore, B., Olsen, E., Ratz, G., Lauridsen, J. B., et al. (1990) Comprehensive two-dimensional gel protein databases offer a global approach to the analysis of human cells: the transformed amnion cells (AMA) master database and its link to genome DNA sequence data. Electrophoresis 11, 989-1071

95. Xiao, B., Smerdon, S. J., Jones, D. H., Dodson, G. G., Soneji, Y., Aitken, A., and Gamblin, S. J. (1995) Structure of a 14-3-3 protein and implications for coordination of multiple signalling pathways. Nature 376, 188-191

96. Yaffe, M. B., Rittinger, K., Volinia, S., Caron, P. R., Aitken, A., Leffers, H., Gamblin, S. J., Smerdon, S. J., and Cantley, L. C. (1997) The structural basis for 14-3-3:phosphopeptide binding specificity. Cell 91, 961-971

97. Coblitz, B., Wu, M., Shikano, S., and Li, M. (2006) C-terminal binding: an expanded repertoire and function of 14-3-3 proteins. FEBS Lett 580, 1531-1535

98. Ganguly, S., Weller, J. L., Ho, A., Chemineau, P., Malpaux, B., and Klein, D. C. (2005) Melatonin synthesis: 14-3-3-dependent activation and inhibition of arylalkylamine N-acetyltransferase mediated by phosphoserine-205. Proc Natl Acad Sci U S A 102, 1222-1227

99. Shikano, S., Coblitz, B., Sun, H., and Li, M. (2005) Genetic isolation of transport signals directing cell surface expression. Nat Cell Biol 7, 985-992

100. Jespers, L. S., Messens, J. H., De Keyser, A., Eeckhout, D., Van den Brande, I., Gansemans, Y. G., Lauwereys, M. J., Vlasuk, G. P., and Stanssens, P. E. (1995) Surface expression and ligand-based selection of cDNAs fused to filamentous phage gene VI. Biotechnology (N Y) 13, 378-382

101. Fuh, G., Pisabarro, M. T., Li, Y., Quan, C., Lasky, L. A., and Sidhu, S. S. (2000) Analysis of PDZ domain-ligand interactions using carboxyl-terminal phage display. J Biol Chem 275, 21486-21491

102. Laura, R. P., Witt, A. S., Held, H. A., Gerstner, R., Deshayes, K., Koehler, M. F., Kosik, K. S., Sidhu, S. S., and Lasky, L. A. (2002) The Erbin PDZ domain binds with high affinity and specificity to the carboxyl termini of delta-catenin and ARVCF. J Biol Chem 277, 12906-12914

103. Cull, M. G., Miller, J. F., and Schatz, P. J. (1992) Screening for receptor ligands using large libraries of peptides linked to the C terminus of the lac repressor. Proc Natl Acad Sci U S A 89, 1865-1869

104. Stricker, N. L., Christopherson, K. S., Yi, B. A., Schatz, P. J., Raab, R. W., Dawes, G., Bassett, D. E., Jr., Bredt, D. S., and Li, M. (1997) PDZ domain of neuronal nitric oxide synthase recognizes novel C-terminal peptide sequences. Nat Biotechnol 15, 336-342

105. Wang, S., Raab, R. W., Schatz, P. J., Guggino, W. B., and Li, M. (1998) Peptide binding consensus of the NHE-RF-PDZ1 domain matches the C-terminal sequence of cystic fibrosis transmembrane conductance regulator (CFTR). FEBS Lett 427, 103-108

106. Fields, S., and Song, O. (1989) A novel genetic system to detect protein-protein interactions. Nature 340, 245-246

107. Kurschner, C., Mermelstein, P. G., Holden, W. T., and Surmeier, D. J. (1998) CIPP, a novel multivalent PDZ domain protein, selectively interacts with Kir4.0 family members, NMDA receptor subunits, neurexins, and neuroligins. Mol Cell Neurosci 11, 161-172

108. Stryer, L., and Haugland, R. P. (1967) Energy transfer: a spectroscopic ruler. Proc Natl Acad Sci U S A 58, 719-726

109. You, X., Nguyen, A. W., Jabaiah, A., Sheff, M. A., Thorn, K. S., and Daugherty, P. S. (2006) Intracellular protein interaction mapping with FRET hybrids. Proc Natl Acad Sci U S A 103, 18458-18463

110. Songyang, Z., Fanning, A. S., Fu, C., Xu, J., Marfatia, S. M., Chishti, A. H., Crompton, A., Chan, A. C., Anderson, J. M., and Cantley, L. C. (1997) Recognition of unique carboxyl-terminal motifs by distinct PDZ domains. Science 275, 73-77

111. Songyang, Z., Blechner, S., Hoagland, N., Hoekstra, M. F., Piwnica-Worms, H., and Cantley, L. C. (1994) Use of an oriented peptide library to determine the optimal substrates of protein kinases. Curr Biol 4, 973-982

112. Kania, R. S., Zuckermann, R. N., and Marlowe, C. K. (1994) Free C-terminal Resin-Bound Peptides: Reversal of Peptide Orientation via A Cyclization/Cleavage Protocol. J Am Chem Soc 116, 8835-8836

113. Davies, M., Bonnat, M., Guillier, F., Kilburn, J. D., and Bradley, M. (1998) Screening an inverted peptide library in water with a guanidinium-based tweezer receptor. J Org Chem 63, 8696-8703

114. Davies, M. (1997) C-terminally modified peptides and peptide libraries-another end to peptide synthesis. Angew Chem Int Ed Engl 36, 1097-1099

115. Wiedemann, U., Boisguerin, P., Leben, R., Leitner, D., Krause, G., Moelling, K., Volkmer-Engert, R., and Oschkinat, H. (2004) Quantification of PDZ domain specificity, prediction of ligand affinity and rational design of super-binding peptides. J Mol Biol 343, 703-718

116. Hoffmüller, U., Russwurm, M., Kleinjung, F., Ashurst, J., Oschkinat, H., Volkmer-Engert, R., Koesling, D., and Schneider-Mergener, J. (1999) Interaction of a PDZ Protein Domain with a Synthetic Library of All Human Protein C Termini. Angew Chem Int Ed 38, 2000-2004

117. Boisguerin, P., Leben, R., Ay, B., Radziwill, G., Moelling, K., Dong, L., and Volkmer-Engert, R. (2004) An improved method for the synthesis of cellulose membrane-bound peptides with free C termini is useful for PDZ domain binding studies. Chem Biol 11, 449-459

118. Frank, R. (2002) The SPOT-synthesis technique. Synthetic peptide arrays on membrane supports--principles and applications. J Immunol Methods 267, 13-26

119. Fodor, S. P., Read, J. L., Pirrung, M. C., Stryer, L., Lu, A. T., and Solas, D. (1991) Light-directed, spatially addressable parallel chemical synthesis. Science 251, 767-773

120. Pease, A. C., Solas, D., Sullivan, E. J., Cronin, M. T., Holmes, C. P., and Fodor, S. P. (1994) Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci U S A 91, 5022-5026

121. MacBeath, G., and Schreiber, S. L. (2000) Printing proteins as microarrays for high-throughput function determination. Science 289, 1760-1763

122. Stiffler, M. A., Chen, J. R., Grantcharova, V. P., Lei, Y., Fuchs, D., Allen, J. E., Zaslavskaia, L. A., and MacBeath, G. (2007) PDZ domain binding selectivity is optimized across the mouse proteome. Science 317, 364-369

123. Stiffler, M. A., Grantcharova, V. P., Sevecka, M., and MacBeath, G. (2006) Uncovering quantitative protein interaction networks for mouse PDZ domains using protein microarrays. J Am Chem Soc 128, 5913-5922

124. Wipf, P. (1995) Synthetic studies of biologically active marine cyclopeptides. Chem Rev 95, 2115-2134

125. Hamada, Y., and Shioiri, T. (2005) Recent progress of the synthetic studies of biologically active marine cyclic peptides and depsipeptides. Chem Rev 105, 4441-4482

126. Spatola, A. F., Crozet, Y., deWit, D., and Yanagisawa, M. (1996) Rediscovering an endothelin antagonist (BQ-123): A self-deconvoluting cyclic pentapeptide library. J Med Chem 39, 3842-3846

127. Yu, Z., and Chu, Y.-H. (1997) Combinatorial epitope search: pitfalls of library design. Bioorg Med Chem Lett 7, 95-98

128. Liu, R., Marik, J., and Lam, K. S. (2002) A Novel Peptide-Based Encoding System for "One-Bead One-Compound" Peptidomimetic and Small Molecule Combinatorial Libraries. J Am Chem Soc 124, 7678-7680

129. Lambert, J. N., Mitchell, J. P., and Roberts, K. D. (2001) The synthesis of cyclic peptides. J Chem Soc, Perkin Trans 1, 471-484

130. Davies, J. S. (2003) The cyclization of peptides and depsipeptides. J Pept Sci 9, 471-501

131. Li, P., Roller, P. P., and Xu, J. (2002) Current synthetic approaches to peptide and peptidomimetic cyclization. Curr Org Chem 6, 411-440

132. Vagner, J., Barany, G., Lam, K. S., Krchnak, V., Sepetov, N. F., Ostrem, J. A., Strop, P., and Lebl, M. (1996) Enzyme-mediated spatial segregation on individual polymeric support beads: application to generation and screening of encoded combinatorial libraries. Proc Natl Acad Sci U S A 93, 8194-8199

133. Ohlmeyer, M. H., Swanson, R. N., Dillard, L. W., Reader, J. C., Asouline, G., Kobayashi, R., Wigler, M., and Still, W. C. (1993) Complex synthetic chemical libraries indexed with molecular tags. Proc Natl Acad Sci U S A 90, 10922-10926

134. Devlin, J. J., Panganiban, L. C., and Devlin, P. E. (1990) Random peptide libraries: a source of specific protein binding molecules. Science 249, 404-406

135. Giebel, L. B., Cass, R. T., Milligan, D. L., Young, D. C., Arze, R., and Johnson, C. R. (1995) Screening of cyclic peptide phage libraries identifies ligands that bind streptavidin with high affinities. Biochemistry 34, 15430-15435

136. Doleckova-Maresova, L., Pavlik, M., Horn, M., and Mares, M. (2005) De novo design of alpha-amylase inhibitor: a small linear mimetic of macromolecular proteinaceous ligands. Chem Biol 12, 1349-1357

137. Ono, K., and Smith, E. E. (1986) Purification of glucoamylase by acarbose (BAY g-5421) affinity chromatography. Biotechnol Appl Biochem 8, 201-209

138. Joo, S. H., Xiao, Q., Ling, Y., Gopishetty, B., and Pei, D. (2006) High-throughput sequence determination of cyclic peptide library members by partial Edman degradation/mass spectrometry. J Am Chem Soc 128, 13000-13009

139. Xiao, Q., and Pei, D. (2007) High-throughput synthesis and screening of cyclic peptide antibiotics. J Med Chem 50, 3132-3137

140. Wilkins, M. R., Gasteiger, E., Bairoch, A., Sanchez, J. C., Williams, K. L., Appel, R. D., and Hochstrasser, D. F. (1999) Protein identification and analysis tools in the ExPASy server. Methods Mol Biol 112, 531-552

141. Basdevant, N., Weinstein, H., and Ceruso, M. (2006) Thermodynamic basis for promiscuity and selectivity in protein-protein interactions: PDZ domains, a case study. J Am Chem Soc 128, 12766-12777

142. Weinman, E. J., Hall, R. A., Friedman, P. A., Liu-Chen, L. Y., and Shenolikar, S. (2006) The association of NHERF adaptor proteins with G protein-coupled receptors and receptor tyrosine kinases. Annu Rev Physiol 68, 491-505

143. Anzai, N., Deval, E., Schaefer, L., Friend, V., Lazdunski, M., and Lingueglia, E. (2002) The multivalent PDZ domain-containing protein CIPP is a partner of acid-sensing ion channel 3 in sensory neurons. J Biol Chem 277, 16655-16661

144. Yin, J., Straight, P. D., McLoughlin, S. M., Zhou, Z., Lin, A. J., Golan, D. E., Kelleher, N. L., Kolter, R., and Walsh, C. T. (2005) Genetically encoded short peptide tag for versatile protein labeling by Sfp phosphopantetheinyl transferase. Proc Natl Acad Sci U S A 102, 15815-15820

145. Wavreille, A. S., Garaud, M., Zhang, Y., and Pei, D. (2007) Defining SH2 domain and PTP specificity by screening combinatorial peptide libraries. Methods 42, 207-219
