SYNTHETIC COMBINATORIAL PEPTIDE LIBRARIES AND THEIR APPLICATION IN DECODING BIOLOGICAL INTERACTIONS
DISSERTATION
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University
By
Michael Cameron Sweeney
The Ohio State University 2005
Dissertation Committee:
Professor Dehua Pei, Advisor
Professor Jill Rafael-Fortney
Professor Charle Bell
Professor Charles Brooks
Approved by: ______ (Dehua Pei, Advisor)
The Ohio State Biochemistry Graduate Program
ABSTRACT
The synthesis of peptides was revolutionized by the adoption of solid-phase synthetic techniques. Subsequent improvement, evolution, and refinement of this chemical technique have allowed research into areas of biology not previously accessible with such speed and breadth. Because of the efficiency and flexibility of the chemistry involved in peptide synthesis, libraries representing millions of unique natural, modified, or unnatural peptides can be constructed rapidly and in high enough purity to obviate the need for purification. In this work, libraries were synthesized for screening against individual protein domains in an effort both to determine the preferred peptidyl binding partner types for each and to establish an optimized, broadly applicable methodology for screening other domains. One of the problems encountered during the development of the screening methodology was the low success rate of sequence determination for the peptides selected by each domain. Herein we report the successful modification of the peptide ladder mass spectrometry sequencing technique referred to as partial Edman degradation (PED). Success rates were improved to greater than 90% for full-length sequence determination of peptides up to 8 residues long, even for the more difficult phosphotyrosine (pY)-containing peptides. As a result of this improvement, three pY-binding Src Homology 2 (SH2) domains and two N-terminus binding Baculoviral Inhibitor-of-Apoptosis Repeat (BIR) domains were screened against their respective libraries, and the preferred ligand types for each were determined. The advantage of sequencing by the PED method became especially clear in the case of the N-terminal SH2 (N-SH2) domain of Src Homology 2 Protein Tyrosine Phosphatase 2 (SHP-2), as previously unidentified sub-classes of binding consensus motifs were distinguishable owing to the discrete nature of the sequencing technique. This work demonstrates the usefulness and potential generality of peptide library screening by this method.
Dedicated to my parents and family
ACKNOWLEDGEMENTS
I would like to thank my advisor, Dr. Dehua Pei. I don’t know which was more important during my training, his broad scientific knowledge, surpassed in breadth only by its depth, or his nearly limitless patience. I benefited greatly from both and will be indebted throughout my career.
I am happy to have known and worked with my labmates, Dr. Kirk Beebe, Dr. Peng Wang, Dr. Kiet Nguyen, Dr. Xubo Hu, Dr. Grace Zhu, Junguk Park, and Anne-Sophie Wavreille. Thank you all for everything you have given me; I can only hope to have positively impacted you as much as you have me.
While at Ohio State, I was lucky enough to have had access to the professional support of the CCIC Mass Spectrometry Facility. The insight and timely assistance provided by Nan Kleinholz, Rhonda Pitsch, Ben Jones, and Josh Ellis under the direction of Dr. Kari Green-Church were indispensable. In addition to their technical expertise, their friendship made visits less about sample submission and more of a welcome reprieve from the daily grind.
Lastly, I am eternally grateful for the unwavering support I have received from my parents and family. Without you, I would never have tasted success in anything. If I can live up to your precedents in work and in life, I will know success in both.
VITA
1997 B.S. Chemistry, The Ohio State University
1997-1999 Completed Years 1 and 2 of Medical School, The Ohio State University
1999-present Graduate Research Assistant, The Ohio State University
July, 2005 Return to Medical School, The Ohio State University
PUBLICATIONS
Sweeney, M. C. and Pei, D. (2003) An improved method for rapid sequencing of support-bound peptides by partial Edman degradation and mass spectrometry. J. Comb. Chem. 5, 218-222.
Sweeney, M. C., Park, J., Wavreille, A-S., and Pei, D. (2005) Decoding protein-protein interactions through combinatorial chemistry: Sequence specificity of SHP-2 and SHIP SH2 domains. Biochemistry, under revision.
Sweeney, M. C., Park, J., Wavreille, A-S., and Pei, D. (2005) Determination of the binding specificities of the BIR domains of XIAP by combinatorial peptide library screening. Chem. Biol., under review.
FIELDS OF STUDY
Major Field: Ohio State Biochemistry Program
TABLE OF CONTENTS
Abstract
Dedication
Acknowledgments
Vita & Publications
List of Schemes
List of Tables
List of Figures
List of Abbreviations
Chapters:
1. General Introduction
    1.1 Solid Phase Peptide Synthesis
    1.2 Peptide Libraries
    1.3 The Importance of Protein Domains to Signal Transduction
2. Peptide and Protein Sequencing by Partial Edman Degradation
    2.1 Introduction
    2.2 The Edman Degradation Methodology
    2.3 Peptide Ladder Detection Scheme for Edman Degradation
    2.4 Peptide Ladder Encoding of Libraries
    2.5 Post-Screening Peptide Ladder Encoding of Library Peptides
    2.6 Development of an Improved Partial Edman Degradation Technique
    2.7 Experimental Designs and Techniques
        2.7.1 General PIC-Based PED and MALDI-TOF MS
        2.7.2 Modified PIC-Based Procedures
        2.7.3 Replacement of PIC by OSu-Esters
        2.7.4 General OSu-Ester-Based PED
        2.7.5 Synthesis of Alternative OSu-Esters
        2.7.6 Synthesis of Pro- and Trp-Containing Test Sequences
        2.7.7 Activated-Disulfide Resin Synthesis for Use in Native Protein PED
        2.7.8 Immobilization of Tryptic Digest of Co2+ E. coli Peptide Deformylase
    2.8 Results and Discussion
    2.9 Conclusion
3. Determination of the Phosphopeptide Ligand Specificities of SHP-2 and SHIP SH2 Domains by Combinatorial Peptide Library Screening
    3.1 Introduction
    3.2 Experimental Procedures
        3.2.1 Vector Constructs
        3.2.2 SHP-2 SH2 Domain Constructs
        3.2.3 Control and SHP-1 Constructs
        3.2.4 Purification and Biotinylation of His6-MBP-SH2 Proteins
        3.2.5 Purification of His-tagged SH2 Domains
        3.2.6 Purification of Full-Length SHP-2 for Stimulation Assays
        3.2.7 Synthesis of pY Library
        3.2.8 Colorimetric Library Screening
        3.2.9 Partial Edman Degradation and Peptide Sequencing
        3.2.10 Synthesis of Biotinylated pY Peptides
        3.2.11 Determination of Dissociation Constants by BIAcore
    3.3 Results
        3.3.1 Library Design, Synthesis, and Screening
        3.3.2 Peptide Sequencing by PED
        3.3.3 Specificity of the C-SH2 Domain of SHP-2
        3.3.4 Specificity of the N-SH2 Domain of SHP-2
        3.3.5 Specificity of SHIP SH2 Domain
        3.3.6 Affinity Measurements of Selected Sequences
        3.3.7 Database Search of Potential SHP-1/SHP-2-Binding Proteins
    3.4 Discussion
4. Determination of the Tetrapeptide Ligand Specificities of the BIR2 and BIR3 Domains of XIAP by Combinatorial Peptide Library Screening
    4.1 Introduction
    4.2 Experimental Techniques
        4.2.1 Vector Constructs
        4.2.2 XIAP BIR Domain and Full-Length Constructs
        4.2.3 Purification and Labeling of His6-MBP-BIR Proteins
        4.2.4 Purification of GST-BIR1-3, GST-XIAP, and GST Control Proteins
        4.2.5 Synthesis of BIR Libraries
        4.2.6 Colorimetric Library Screening
        4.2.7 Fluorimetric Library Screening
        4.2.8 Partial Edman Degradation and Peptide Sequencing
        4.2.9 Synthesis of Biotinylated pY Peptides
        4.2.10 Determination of Dissociation Constants by BIAcore
    4.3 Results
        4.3.1 Library Construction and Screening
        4.3.2 Binding Specificity of the BIR2 Domain
        4.3.3 Binding Specificity of the BIR3 Domain
        4.3.4 Affinity Measurements of Selected Peptides
        4.3.5 Database Search for Potential BIR2 and BIR3 Binding Partners
        4.3.6 Probing BIR3-Caspase-10d Interactions
    4.4 Discussion
    4.5 Conclusion
5. Materials and General Methods
    5.1 Materials
    5.2 Buffers
    5.3 General Biochemical and Biological Methods
        5.3.1 Materials
        5.3.2 Growth Media
        5.3.3 Growth and Storage of Bacterial Strains
        5.3.4 Preparation of Competent Cells
        5.3.5 Quantitation of DNA and RNA
        5.3.6 Protein Quantitation
    5.4 Electrophoresis
        5.4.1 Agarose Gel
        5.4.2 Polyacrylamide Gels for Protein Separation
        5.4.3 Urea-PAGE Gels for Oligonucleotide Purification
    5.5 Recombinant DNA Techniques
        5.5.1 Restriction Digestions
        5.5.2 Filling Recessed 3’-Termini and Removing Protruding 3’-Termini
        5.5.3 Removal of 5’ Phosphates
        5.5.4 Ligation of DNA
        5.5.5 Transformation
        5.5.6 Small-Scale Preparation of Plasmid DNAs
        5.5.7 Mutagenesis
        5.5.8 Sequencing
Appendix: Supplementary Schemes, Tables, and Figures
Bibliography
LIST OF SCHEMES
2.1 Traditional Edman degradation
2.2 Solution-phase peptide-ladder sequencing
2.3 Support-bound peptide-ladder sequencing
A1 Construction of pETMAL vector
A2 Construction of pET-PNPT vector
A3 Construction of pPPTmal vector
A4 Construction of pGFPmal vector
LIST OF TABLES
2.1 Virtual EcPDF tryptic digest
3.1 SHP-2 C-SH2 domain selected peptides
3.2 SHP-2 N-SH2 domain selected peptides
3.3 SHIP SH2 domain selected peptides
3.4 Dissociation constants of SH2 domains toward selected pY peptides
3.5 Potential human SHP-1/SHP-2 interacting proteins from database search
4.1 XIAP BIR2 domain selected peptides
4.2 XIAP BIR3 domain selected peptides
4.3 Dissociation constants of BIR domains toward selected peptides
4.4 Potential human BIR2 interacting proteins from database search
4.5 Potential human BIR3 interacting proteins from database search
A1 Additional SHP-2 C-SH2 selected peptides
A2 Additional SHP-2 N-SH2 selected peptides
A3 Additional SHIP SH2 selected peptides
A4 Peptide sequences selected during BIR3 domain fluorimetric screening
LIST OF FIGURES
2.1 MALDI-MS sequence of [Glu1] fibrinopeptide B
2.2 MALDI-TOF MS of support-bound peptide
2.3 MALDI-TOF MS of support-bound pY peptide
2.4 MALDI-TOF MS of Pro- and Trp-containing peptides
2.5 MALDI-TOF MS of Trp oxidation products
2.6 MALDI-TOF MS of immobilized EcPDF tryptic digest
3.1 MALDI-TOF MS of C-SH2 domain selected pY peptide
3.2 Histograms of SHP-2 C-SH2 domain selected pY peptides
3.3 Histograms of SHP-2 N-SH2 domain selected pY peptides
3.4 Histograms of SHIP SH2 domain selected pY peptides
3.5 BIAcore sensorgram and secondary plot
4.1 Histograms of BIR2 domain selected peptides
4.2 Histograms of BIR3 domain selected peptides
A1 SDS-PAGE gel of full-length SHP-2
A2 Composite histogram of SHP-2 N-SH2 selected sequences
A3 Histogram of SHP-2 N-SH2 selected Class III and IV peptides
ABBREVIATIONS
AcCN Acetonitrile
BOC t-Butyloxycarbonyl
Biotin-OSu D-Biotin O-Succinimide
BSA Bovine Serum Albumin
β β-Alanine, β-Aminopropionic Acid
BCIP 5-Bromo-4-chloro-3-indolyl Phosphate
BME β-Mercaptoethanol
Bz-OSu Benzoic Acid O-Succinimide
CLEAR Cross-Linked Ethoxylate Acrylate Resin
CIP Calf Intestinal Alkaline Phosphatase
DCM Dichloromethane
DIPEA N,N-Diisopropylethylamine
DMF N,N-Dimethylformamide
DNA Deoxyribonucleic Acid
dNTPs Deoxyribonucleotide Triphosphates
DTNB 5,5’-Dithiobis(2-nitrobenzoic acid), also Ellman’s Reagent
DTT Dithiothreitol
EDTA N,N,N’,N’-Ethylenediamine Tetraacetate
ESI-MS Electrospray Ionization-Mass Spectrometry
FMOC, Fmoc 9-Fluorenylmethoxycarbonyl
FPLC Fast Protein Liquid Chromatography
GFP Green Fluorescent Protein
GST Glutathione S-Transferase
GSH Glutathione
HBTU N-[(1H-Benzotriazol-1-yl)(dimethylamino)methylene]-N-methylmethanaminium hexafluorophosphate N-oxide
HPLC High Performance Liquid Chromatography, sometimes High Pressure Liquid Chromatography
HOBt 1-Hydroxybenzotriazole
iPrOH Isopropanol
IPTG Isopropyl-β-D-thiogalactoside
IMAC Immobilized Metal Affinity Chromatography
LBA/LBK Luria-Bertani media + Ampicillin or Kanamycin Antibiotic
MALDI-TOF Matrix Assisted Laser Desorption Ionization – Time Of Flight
MBP Maltose Binding Protein
MCS Multiple Cloning Sites
MeOH Methanol
Nic-OSu Nicotinic Acid O-Succinimide, also Nic-NHS
Nle Norleucine
NMM N-Methylmorpholine
PCR Polymerase Chain Reaction
PED Partial Edman Degradation
PEG Polyethylene Glycol
PEGA Polyethylene Glycol Dimethylacrylamide
PIC Phenylisocyanate
PITC Phenylisothiocyanate
pY Phosphotyrosine
SA-AP Streptavidin-Alkaline Phosphatase
SDM Site-Directed Mutagenesis
SDS-PAGE Sodium Dodecylsulfate-Polyacrylamide Gel Electrophoresis
SPPS Solid Phase Peptide Synthesis
SPR Surface Plasmon Resonance
TCEP Tris(2-carboxyethyl)phosphine
TEA Triethylamine
TFA Trifluoroacetic Acid
Standard one letter codes are used for deoxynucleotides occurring in PCR primers, while standard one and three letter abbreviations are used for genomically encoded amino acid residues.
CHAPTER 1
GENERAL INTRODUCTION
1.1 Solid Phase Peptide Synthesis
In 1963 R. Bruce Merrifield successfully demonstrated a technique for tethering
amino acids to an insoluble polymer support followed by step-wise peptide bond
formation and eventual cleavage from the support. His initial synthesis of a simple
model tetrapeptide [1] marked a significant break from traditional synthetic methodology
involving purification and characterization of products following each step. Moreover,
the ease with which this biphasic system allowed removal of unreacted reagents meant
that large excesses could be used in the coupling reactions to ensure completion at each
step. As a result, in most cases truncation products are limited and desired products can
be obtained in good yield following a single purification. The revolutionary concept of solid phase peptide synthesis (SPPS) has since been applied to the synthesis of other polymers, such as oligonucleotides [2] and oligosaccharides [3], and is the very basis for combinatorial library synthesis [4-6]. The ability to synthesize biologically relevant molecules, either individually or combinatorially, in a routine manner has had profound effects on the speed and types of biochemical research performed to date, and Dr. Merrifield was awarded the Nobel Prize in Chemistry in 1984 for his groundbreaking technique.
In the forty-plus years since the introduction of SPPS, numerous modifications
and improvements have been made in the strategies and reagents involved. Most
significantly, the “BOC/HF” strategy which evolved first [7-9] and relied on the relative
acid stabilities of permanent side-chain versus temporary α-amino protecting groups
between cycles, has been largely supplanted by the orthogonal “FMOC/piperidine”
approach. In this latter scheme, temporary α-amino protection is afforded by the base
labile 9-fluorenylmethoxycarbonyl group (FMOC) developed by Carpino [10], whereas the side-chain protecting groups are unreactive under basic conditions and are removed by acid treatment. Additional improvements have been made in coupling reagents and the resin supports. The introduction of the phosphonium, aminium/uronium [11], and acid
fluoride-based carboxylate activating reagents has improved yields and reduced
racemization difficulties compared to carbodiimide reagents [reviewed in 12].
Meanwhile, incorporation of polyethyleneglycol (PEG) into resins, either as a graft onto polystyrene (e.g. TentaGel, Argogel) or as the scaffolding itself (e.g. PEGA, CLEAR) has improved the solvation and swelling properties of the resins for improved coupling efficiency, especially for difficult syntheses [13, 14]. Moreover, the amphipathic nature of the PEG linker is important in the current work as it allows for synthesis of peptide libraries in organic solvents and biological screening reactions in aqueous environments.
1.2 Peptide Libraries
In the early 1990s, the development of large peptide libraries generated from randomized DNA sequences expressed and presented in the coat proteins of bacteriophage, referred to as phage display, marked an important advance in biochemistry [15]. Shortly thereafter, Geysen’s earlier work [16] was expanded into large synthetic combinatorial peptide libraries using SPPS techniques [4-6]. More
recently, other chimeric nucleotide-peptide screening systems have been demonstrated
[17-20]. These powerful tools have allowed researchers to rapidly sample large regions
of ligand space in the search for binding interactions and characterization of enzymatic
reactions. Each technique has been extensively reviewed [21-23] and each has distinct
advantages and disadvantages. In many ways these techniques serve to complement each other in terms of the types, sizes, and constitutions of the peptide libraries obtained and
thus the types of binding or enzymatic screenings possible. For the sake of brevity, the
discussion here will focus on phage display and the common SPPS
library methods.
Peptide phage display library screening represents a remarkable collaboration
between chemistry and biology. The randomized DNA library is chemically synthesized,
whereas the peptide epitopes are biologically synthesized [15]. Paramount to the successful application of this relationship, however, is that each DNA message be physically linked to the epitope it encodes. Thus, each phage capsid serves not only to display peptides for screening, but also to segregate and record the identity of each
selected peptide. Additionally, because of the infectious and reproductive nature of the
viral particle, selected phages can be harvested and propagated for subsequent rounds of
selection. After several rounds of competition and enrichment, very high affinity binding
epitopes can be elucidated by this methodology. Moreover, because the record-carrying
DNA can be amplified in the sequencing reaction, very little sample is needed and
detection limits are not of concern. Such amplification is not possible during protein
sequencing of synthetic libraries and marks a significant advantage of nucleotide
encoding [24].
Phage display has been applied with success to the elucidation and refinement of peptide ligands for a wide range of targets [25-27]. However, there are drawbacks to systems reliant upon biology for library synthesis. First and foremost, until recently, these libraries were limited in their composition to the 20 naturally occurring L-amino acids. In the instances where non-natural amino acids have been incorporated by amber suppression, diversity remains relatively limited and separate screenings are necessary for each analogue (i.e. direct competition is not yet possible). Furthermore, at present biosynthetic incorporation of non-proteinogenic residues is by no means trivial [28-33]. Second, positional biases, albeit relatively mild, have been demonstrated in phage display libraries stemming from the selective pressures of host-phage interactions [34-36].
Synthetic peptide libraries can be designed for screening in two forms, solution phase and resin-bound. Both are generated by the split-pool method [4, 6], but solution phase libraries are cleaved from the resin before screening, whereas resin-bound libraries remain tethered to the support after de-protection such that many copies of one molecule
(~100 pmol) are displayed per bead (the one-bead one-compound, OBOC, principle). In contrast to the biological synthesis methods above, nucleotide encoding of synthetic libraries is burdensome and rarely employed [37, 38], and so selected peptides must be analyzed by a means other than PCR. The lack of an amplification mechanism introduces detection-limit considerations for synthetic libraries, and strategies for encoding libraries and deciphering the selected targets have been creative and varied
(reviewed in [39-41], discussed further in Chapter 2). Soluble peptide libraries have been successfully screened for ligand binding interactions, enzyme substrate preferences
[42], and inhibitors. In most cases, selected peptides are sequenced by pooled Edman degradation or electrospray ionization-mass spectrometry (ESI-MS). Besides
potentially missing low-copy, high-affinity ligands, pooled sequencing methods cannot
distinguish multiple, contextual binding/substrate motifs. Likewise, there are limits to
ESI-MS techniques, such as library size and potential sample loss during de-salting.
Solid-phase libraries offer an advantage by keeping each peptide separate and thus,
separately identifiable. This advantage can be especially helpful in identifying high-affinity ligands that may not conform to more highly represented motifs and in distinguishing among multiple motifs (see Chapter 3). The drawback lies in the need to sequence large numbers of positively selected sequences rapidly and inexpensively.
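As a rough illustration of why split-pool synthesis leaves a single compound on each bead, the following Python sketch simulates the pooling, splitting, and coupling steps. The function name, bead count, and four-letter building-block alphabet are illustrative assumptions and are not parameters taken from this work.

```python
import random

def split_pool(n_beads, building_blocks, cycles, seed=0):
    """Simulate split-pool synthesis; returns one sequence string per bead.

    Hypothetical helper for illustration only. Each string records the
    order in which building blocks were coupled to that bead.
    """
    rng = random.Random(seed)
    beads = [""] * n_beads
    for _ in range(cycles):
        rng.shuffle(beads)  # pool all beads and mix them
        k = len(building_blocks)
        # split into k portions; each portion is coupled to one building block
        portions = [beads[i::k] for i in range(k)]
        beads = [seq + block
                 for block, portion in zip(building_blocks, portions)
                 for seq in portion]
    return beads

beads = split_pool(n_beads=1000, building_blocks="ACDE", cycles=3)
print(len(beads), "beads,", len(set(beads)), "distinct sequences")
```

Because a bead only ever sits in one reaction vessel per cycle, every bead ends up with exactly one sequence, while the pooled population samples the full set of combinations.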
Overcoming deconvolution difficulties associated with synthetic libraries by whichever means demonstrated in the literature, or as yet undescribed, opens doors to ligand space not attainable by genetically encoded systems. Although not as numerically diverse (usually 10^6 to 10^7 members) or as positionally representative (usually 5 to 6 fully
randomized positions) as their transcribed counterparts, synthetic libraries offer
far more direct access to a far more diverse set of unnatural monomers.
Moreover, multiple non-encoded building blocks can be incorporated, allowing direct
competition and selection between them. Indeed, libraries can be composed entirely of
unnatural D-amino acids in the search for protease-resistant receptor agonists.
Additionally, post-translationally modified (acetylated, formylated, methylated,
phosphorylated, etc.) amino acids are difficult or impossible to include in biologically-
originated libraries in an unbiased manner, but are generally amenable to synthetic
incorporation. Thus, there are types of screenings for which each library design is best
suited.
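The library sizes mentioned in this section follow from simple counting: a library with n fully randomized positions and m building blocks per position has m^n possible members. A minimal sketch, assuming 20 building blocks per position (the number actually used in a given library may differ):

```python
def library_size(building_blocks, positions):
    """Theoretical diversity of a fully randomized library."""
    return building_blocks ** positions

for n in (5, 6):
    print(n, "positions:", library_size(20, n), "sequences")
# 5 positions: 3200000 sequences
# 6 positions: 64000000 sequences
```

Under this assumption, five or six randomized positions give a theoretical diversity on the order of 10^6 to 10^7 sequences.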
1.3 The Importance of Protein Domains to Signal Transduction
The pioneering research led by Tony Pawson in the 1980s and 1990s, in which the Src Homology 2 (SH2) domain was functionally characterized as a conserved, self-folding, phosphotyrosine (pY) binding entity, elevated the role of recognition domains in signal transduction and cellular regulation [43-45]. The subsequent characterization of at least 50 conserved recognition domains, each binding substrates as chemically and spatially diverse as nucleic acids, lipids, and peptides, has helped bridge signaling pathways from the membrane to the nucleus and points in between [46]. Diverse recognition events by diverse domains can be linked by the modular nature of the domains to create natural “fusion proteins” such as Grb2, a small adaptor protein composed of two SH3 domains and one SH2 domain. The result, in this example, is the linkage of a pY recognition event to two different Pro-based recognition events via the physically linked SH2 and SH3 domains, respectively. Thus, such an adaptor molecule can both amplify a signal and build specificity into a pathway and response via the inherent specificities of Grb2’s fused domains. For instance, the SH2 domain of Grb2 not only recognizes pY, it does so in a contextually specific manner based on the surrounding C-terminal residues; in this case the preferred motif is pYXN, where X represents any amino acid. This is in contrast to the SH2 domain of Src, which prefers a pYEEI motif. Likewise, the two SH3 domains within Grb2 prefer slightly different PXXP or RXXP motifs. Considering that there are 115 SH2 and 253 SH3 domains encoded by the human genome, each possessing unique binding preferences, the combinatorial signaling possibilities attainable through their fusion seem almost limitless.
One deconstructionist means of probing signaling pathways and directing their study is through the application of combinatorial library screening of the individual constituent domains. The use of phage display and other biologically derived libraries for the probing of domains which recognize unmodified peptides, e.g. SH3, is nicely complemented by SPPS-based libraries for understanding the specificity of domains requiring post-translational modifications, such as SH2, Bromo, Chromo, etc. Research involving pY peptide library screening led by Lewis Cantley [47] demonstrated the feasibility of such an approach, and in fact allowed the sub-classification of SH2 domains based on their recognition preferences. Subsequent refinement and expansion of this application has been ongoing, and we endeavored to add to the understanding of signal transduction while improving the methodology of library screening.
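Consensus motifs of the kind quoted in this section (pYXN, pYEEI, PXXP, RXXP) can be checked against candidate sequences with ordinary pattern matching. The sketch below is only an illustration: the helper names are hypothetical, the two test peptides are invented, and X is assumed to match any single unmodified residue written as one capital letter.

```python
import re

def motif_to_regex(motif):
    """Translate a motif such as 'pYXN' into a regex; X matches any one residue."""
    return re.compile("".join("[A-Z]" if ch == "X" else re.escape(ch)
                              for ch in motif))

def find_motifs(sequence, motifs):
    """Return the motifs that occur anywhere in the sequence."""
    return [m for m in motifs if motif_to_regex(m).search(sequence)]

MOTIFS = ["pYXN", "pYEEI", "PXXP", "RXXP"]
print(find_motifs("EpYVNVQ", MOTIFS))   # hypothetical pY peptide -> ['pYXN']
print(find_motifs("APPLPPRN", MOTIFS))  # hypothetical Pro-rich peptide -> ['PXXP']
```

Writing phosphotyrosine as the two-character token "pY" keeps it distinct from unmodified Tyr, at the cost that X in this sketch cannot itself stand for a phosphorylated residue.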
CHAPTER 2
PEPTIDE AND PROTEIN SEQUENCING BY PARTIAL EDMAN DEGRADATION
2.1 Introduction
Equally important as designing and synthesizing a high quality peptide library for
screening is “reading” the results of that screening. While the generally encountered
chemical strategies by which to synthesize libraries are few, BOC/HF or
FMOC/piperidine, the schemes by which to decode the selected sequences are many and
are not limited to chemistry. The difficulty is finding a means of robustly encoding large
numbers (generally ≥ 10^6) of peptides with unique identifiers that, ideally, 1) won’t
interfere with the binding or catalysis being studied; 2) won’t bias the library; 3) can be read quickly and cheaply; 4) can be incorporated quickly and cheaply; and 5) in cases of chemical encoding, can be detected at fairly low limits with high accuracy.
Attempts to devise an ideal sequence encoding method have been far-ranging, and while some schemes are more promising than others, each fails at least one criterion of ideality.
A few examples of resin-bound non-chemical library encoding methods include spatial
segregation (i.e. peptide arrays), radio frequency tags, and optical diversification, among
others. Several of the exogenous chemical-tagging systems described involve adding
haloaromatics, secondary amines, and nucleotides during library synthesis. The term exogenous is included in the previous descriptor because the peptides are themselves
chemicals and can in fact chemically encode their own sequence, endogenously. This
self-encoding nature can be exploited in several ways, but unlike oligonucleotides, amplification of the message is not possible, and thus detection limits are always of concern.
2.2 The Edman Degradation Methodology
The chemical methodology of sequentially removing and identifying each N- terminal residue of a protein or peptide using phenylisothiocyanate (PITC) was originally
described by Pehr Edman in 1950 [48]. Optimization of the degradation technique
(Scheme 2.1) which bears his name has yielded an automated process by which high
performance liquid chromatography (HPLC) identification of the degradation product, a
phenylthiohydantoin (PTH) bearing the N-terminal residue, can routinely be achieved at
limits of 5-10 pmol of peptide sample [4, 49]. In many ways, determining the sequence
of selected peptides by way of the inherent peptide sequence seems ideal. However, with
regard to solution phase library screenings, the selected peptides are necessarily a
mixture, and thus Edman degradation can only yield the most preferred residues at each
position. With regard to resin-bound libraries, sequencing each positive bead (often >
100 selected) is cost and time prohibitive. Therefore, selected beads are often pooled and
sequenced as a mixture. Unfortunately, this act of pooling negates some of the power of
the one-bead-one-compound (OBOC) principle since only the most preferred residue(s)
at each position are observable. Contextual data, the contribution of one residue toward
binding or enzymatic specificity in the context of other residues in the same peptide, can
only be had by decoding each sequence individually. This topic will be revisited.
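The loss of contextual data on pooling can be made concrete with a small sketch. The four tripeptides below are invented for illustration and are not results from this work; they fall into two hypothetical motif classes. Pooling reduces them to a per-position list of observed residues, which is consistent with many sequences that were never selected.

```python
from itertools import product

def positional_profile(sequences):
    """Residues observed at each position when the sequences are pooled."""
    return [sorted(set(column)) for column in zip(*sequences)]

# Hypothetical selected peptides falling into two motif classes (VN_ and ID_)
selected = ["VNI", "VNL", "IDM", "IDF"]

profile = positional_profile(selected)
consistent = ["".join(p) for p in product(*profile)]
print(profile)  # [['I', 'V'], ['D', 'N'], ['F', 'I', 'L', 'M']]
print(len(consistent), "sequences are consistent with the pooled profile")
```

Only individual sequencing shows that V pairs with N and I pairs with D in this toy data set; the pooled profile alone cannot exclude combinations such as VD or IN.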
2.3 Peptide Ladder Detection Scheme for Edman Degradation
Traditionally, after cleavage from the parent peptide the PITC derivative must be extracted, treated for rearrangement, and identified by its HPLC retention time. Chait and colleagues demonstrated a novel approach [50]. Instead of handling and analyzing the small PTH derivative produced after each degradation cycle by a time-consuming
HPLC method, they generated a “peptide ladder” out of the parent peptide through competitive capping and degradation (Scheme 2.2). The N-terminal capping reagent, included as a small percentage of the PITC, was its oxygen analogue, phenylisocyanate (PIC), which formed a stable N-terminal urea (phenylcarbamyl derivative) and blocked degradation of a small number of peptides in each cycle. The resulting peptide ladder was analyzed rapidly, and all at once, by mass spectrometry. In the mass spectrum, a peak was observed for each capping product, like the rungs of a ladder. Each rung differed from the next by the residue mass of the amino acid removed between them, and thus the ordered identity of most amino acids was obtained from a single spectrum (Fig. 2.1, taken from [50]). Their work, conducted in a solution phase spinning cup sequenator, demonstrated the feasibility of constructing and analyzing a peptide ladder through Edman chemistry and mass spectrometry.
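Reading a sequence from a peptide ladder amounts to sorting the peaks by mass and matching each successive mass difference to a residue mass. The following Python sketch illustrates the idea under stated assumptions: the function names are hypothetical, the ladder is simulated rather than measured, every rung is assumed to carry the same cap and C-terminal offset (so these cancel in the differences), and the 0.02 Da tolerance is arbitrary. The table lists standard monoisotopic residue masses.

```python
# Monoisotopic residue masses (Da) of the 20 proteinogenic amino acids
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def ladder_masses(sequence, offset=0.0):
    """Simulate ladder peaks: the full-length peptide and each N-terminal truncation.

    `offset` stands in for everything common to all rungs (cap, linker, charge).
    """
    return [sum(RESIDUE_MASS[r] for r in sequence[i:]) + offset
            for i in range(len(sequence))]

def read_ladder(peaks, tol=0.02):
    """Call residues from successive mass differences, heaviest peak first."""
    ordered = sorted(peaks, reverse=True)
    calls = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = sorted(r for r, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)
        calls.append("/".join(matches) if matches else "?")
    return calls

peaks = ladder_masses("AVYEG", offset=300.0)  # hypothetical peptide and offset
print(read_ladder(peaks))                     # ['A', 'V', 'Y', 'E']
```

Two limitations are visible even in this sketch. Leu and Ile are isobaric and are reported as "I/L", and with a looser tolerance Lys and Gln (0.036 Da apart) would also become ambiguous. The final residue is not called because no lighter rung follows it; identifying it would require knowing the mass common to all rungs.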
2.4 Peptide Ladder Encoding of Libraries
In a subsequent application of the peptide ladder-mass spectrometry sequencing concept to library encoding, Youngquist et al. described a method in which a peptide ladder was encoded during synthesis of an on-bead library [51]. This method allowed the peptides to act as their own endogenous chemical tags. In each cycle of peptide elongation, a small percentage of N-acetylated amino acid was incorporated along with the FMOC-protected amino acid, thus terminating further elongation for this small percentage by capping the N-terminus. After screening, the full-length and chain-terminated peptides, each differing from the next by one amino acid, were released from the resin into individual vessels and analyzed by mass spectrometry. The mass-identity of each residue was obtained in sequential order for each peptide individually, in marked contrast to the pooled sequencing technique. Moreover, this technique was exceedingly fast, cheap, and reliable, and had very low detection limits. However, because truncation products were present during screening, binding could not be attributed unequivocally to the full-length product. Moreover, the non-uniform reactivities of the amino acids during SPPS meant that the ratio of capping to elongation was not equal among amino acids, and thus a bias against the representation of certain residues in the library could not be ruled out, especially for longer sequences.
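The amount of truncated material that such synthesis-encoded beads carry into a screen can be estimated with a short calculation. This is a sketch under simplifying assumptions: the capping fraction is taken to be identical in every cycle (which, as noted above, is not strictly true), and the 10% value is an arbitrary stand-in for the "small percentage" described, not a figure from the cited work.

```python
def ladder_fractions(n_cycles, cap_fraction):
    """Fraction of chains terminated at each cycle, plus the full-length fraction."""
    fractions = []
    remaining = 1.0
    for _ in range(n_cycles):
        fractions.append(remaining * cap_fraction)  # chains capped in this cycle
        remaining *= 1.0 - cap_fraction             # chains that keep elongating
    return fractions, remaining

capped, full_length = ladder_fractions(n_cycles=8, cap_fraction=0.10)
print([round(f, 3) for f in capped], round(full_length, 3))
```

With these assumed numbers, fewer than half of the chains on a bead reach full length after eight cycles, which illustrates why binding to truncation products is a real concern for ladders built during synthesis rather than after screening.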
2.5 Post-Screening Peptide Ladder Encoding of Library Peptides
Chait’s technique, dubbed partial Edman degradation (PED), was subsequently utilized in our lab for sequencing peptides selected from an on-bead library (Scheme 2.3) [52]. In this way, the peptides presented during screening were nearly all full-length and unbiased, and yet an individual sequence was obtained for each binding peptide rapidly and at low cost. However, the reagent PIC proved difficult to control, and the reproducibility of the PED sequencing results was less than satisfactory. Thus we endeavored to modify the technique by altering the solvent, substituting for PITC, and finally replacing PIC. The eventual result was a more robust, forgiving, and easily optimized protocol based on the substitution of PIC by the less reactive O-succinimide (OSu) esters of benzoic and nicotinic acid.
2.6 Development of an Improved Partial Edman Degradation Technique
In the course of screening pY and N-terminal libraries (design, construction, and screening of libraries are described in Chapters 3 & 4), we, along with colleagues in a neighboring lab, found that the determination of full-length sequences for these short peptides was frustratingly infrequent and irreproducible. Conditions and PITC:PIC ratios optimized against one batch of test beads failed to yield similar results the next day, or even in parallel experiments. A systematic deconstruction of the degradation scheme ensued.
2.7 Experimental Designs and Techniques
2.7.1 General PIC-Based PED and MALDI-TOF MS
Fresh library beads (50-100), each containing ~100 pmol of covalently attached peptide, were placed in a glass-fritted vessel to allow filtering. After washing with dichloromethane (DCM) and methanol (MeOH), the beads were suspended in 250 µL of degradation solvent (1:1 H2O:pyridine). In an Eppendorf microcentrifuge tube, enough degradation solution was prepared for two cycles by adding 110 µL of PITC from a freshly opened ampoule and 5.5 µL of fresh PIC (5% v/v) to 435 µL of pyridine. The degradation solution (250 µL) was added to the suspended beads with rapid mixing by pipette. The mixture was incubated at room temperature for 10 min, drained, washed with MeOH, DCM, and anhydrous trifluoroacetic acid (TFA), and suspended in TFA for 10 min. After draining and washing, the beads were re-suspended in degradation solvent and the cycle was repeated until the final position of the library, at which point only PIC was added.
Submission for MALDI-TOF mass spectrometric analysis was performed virtually unchanged throughout the work presented here. Following any PED procedure, the beads were subjected to a reductive work-up in order to reduce any methionine sulfoxide that may have formed in the course of screening or degrading. Reduction was performed by suspending the beads in the degradation vessel in ~1 mL of TFA on ice. After 5 min on ice, dimethyl sulfide (20 µL) and ammonium iodide (10 mg) were added. The vessels were incubated on ice with intermittent mixing by hand for 20 min, drained, and washed with TFA. Following extensive washing with ddH2O, the beads were transferred to individual microcentrifuge tubes and allowed to dry for 1 hr. The peptide ladders were then cleaved from the resin by treatment overnight in the dark with 20 µL of 70% TFA containing CNBr (20 mg/mL). Upon evacuation to dryness in a SpeedVac, the peptide mixtures were re-suspended in 5 µL of 0.1% TFA, from which 1 µL of the peptide solution was mixed with 2 µL of 0.1% TFA in 50% acetonitrile saturated with 4-hydroxy-α-cyanocinnamic acid and spotted onto a 96-well MALDI-TOF sample plate. MALDI-TOF mass spectrometry was performed on a Bruker Reflex III instrument in an automated manner. Sequence determination from the mass spectra was performed manually.
2.7.2 Modified PIC-Based Procedures
Experiments were attempted in which 1) the ratio of PIC:PITC was varied from 0.5% to 20%; 2) the solvent ratio was altered from 3:1 to 1:3 H2O:pyridine; 3) the reaction temperature was increased to 42 ºC; and 4) combinations of these alterations were explored. Some conditions, notably the higher PIC concentrations, accentuated N-terminal residues but left the later positions uninterpretable. Conversely, some conditions favored the C-terminus to the detriment of the earlier residues. Moreover, when promising conditions were found, they could not be reproduced consistently.
The next course of action was to experiment with a PITC analogue. PIC was expected to be more electrophilic than PITC, and so an attempt was made to increase the latter’s reactivity towards the N-terminus during the coupling phase. However, it has long been appreciated that increased electrophilicity during the coupling phase results in decreased nucleophilicity of the thiocarbamoyl group during the acid-promoted cleavage phase. Nevertheless, p-nitrophenylisothiocyanate was substituted for PITC in three experiments. Competition reactions containing 0.9, 1.8, and 3.5% PIC (v/v) were performed. Because of the electron-withdrawing effect of the nitro substituent, the TFA phase of the cycle was doubled to 20 min to promote cyclization and cleavage.
2.7.3 Replacement of PIC by OSu-Esters
Following attempts to replace PITC, a more controllable capping versus degradation competition chemistry was still desirable. It therefore fell to replacing PIC with an amine-reactive species that would deliver an acid-stable moiety. Sometimes referred to as N-hydroxysuccinimidyl esters, OSu derivatives of carboxylic acids have been used to good effect as relatively stable amine-reactive agents capable of protein modification in aqueous environments. In our own lab, we have used D-biotin-OSu to label proteins for screening (Chapters 3 & 4). A strategy was devised in which inexpensive, commercially available OSu derivatives would be substituted for PIC, beginning with benzoic acid-OSu (Bz-OSu). This reagent yielded significant improvements and promising results from the very first attempt, but eventually was itself replaced by nicotinic acid-OSu (Nic-OSu). Three additional OSu-esters were synthesized for use as novel capping reagents from (3-carboxypropyl)trimethylammonium chloride, p-bromobenzoic acid, and N-methylnicotinic acid.
2.7.4 General OSu-Ester-Based PED
The method is subtly modified from the PIC-based procedure. Approximately 50 beads were placed in the degradation vessel and washed as before, and a 500 mM Bz-OSu or 400 mM Nic-OSu [crystallized from ethyl acetate (EtOAc) prior to use] stock solution was made in pyridine. The degradation solution (400 µL) was prepared anew every two cycles to contain 5% PITC (20 µL, 0.13 mmol) and a 0.67 molar ratio of OSu ester to PITC (0.087 mmol) in pyridine. After suspending the beads in 160 µL of the degradation solvent (2:1 pyridine:H2O), an equal volume of degradation solution was added and the vessel was swirled by hand for mixing (instead of mixing by pipette) prior to placement on the hub of a rotary shaker. The coupling reaction proceeded for 6 min before draining and washing as before. Aside from altering the TFA cleavage time from 10 min to 2 x 6 min, all other aspects remained unchanged from the PIC-based procedures.
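The reagent amounts in this procedure can be cross-checked with simple arithmetic. The following sketch takes the stated quantities at face value (0.13 mmol PITC in 20 µL and 0.087 mmol OSu ester in a 400 µL batch) and computes the volume of stock solution required and the pyridine needed to make up the volume. The function and variable names are illustrative only.

```python
def degradation_mix(total_ul, pitc_ul, osu_mmol, stock_molar):
    """Volumes (µL) of OSu-ester stock and make-up pyridine for one batch."""
    stock_ul = osu_mmol / stock_molar * 1000.0  # mmol / (mol/L) = mL, then mL -> µL
    pyridine_ul = total_ul - pitc_ul - stock_ul
    return stock_ul, pyridine_ul

# Stated values: 400 µL batch, 20 µL PITC (0.13 mmol), 0.087 mmol OSu ester.
bz_stock, bz_pyridine = degradation_mix(400, 20, 0.087, 0.500)    # 500 mM Bz-OSu
nic_stock, nic_pyridine = degradation_mix(400, 20, 0.087, 0.400)  # 400 mM Nic-OSu
molar_ratio = 0.087 / 0.13  # OSu ester relative to PITC

print(round(bz_stock, 1), round(bz_pyridine, 1))
print(round(nic_stock, 1), round(nic_pyridine, 1))
print(round(molar_ratio, 2))
```

The ester-to-PITC ratio of about 0.67 obtained from the stated millimole amounts corresponds to the 3:2 PITC/Bz-NHS ratio given in the legend of Scheme 2.3.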
2.7.5 Synthesis of Alternative OSu-Esters
The OSu-esters of (3-carboxypropyl)trimethylammonium chloride, p-bromobenzoic acid, and N-methylnicotinic acid were synthesized for testing in the PED scheme. A trimethylammoniumbutyric acid-OSu chloride activated ester derived from (3-carboxypropyl)trimethylammonium chloride was synthesized by Anthony Simpson and tested in the PED scheme without purification. The p-bromobenzoic acid-OSu molecule was synthesized by dissolution of the acid (4.0 g, 20 mmol) in anhydrous tetrahydrofuran (THF) (50 mL), followed by the addition of N-hydroxysuccinimide (NHS) (2.75 g, 24 mmol) and lastly, 1,3-diisopropylcarbodiimide (DIC) (3.75 mL, 24 mmol). The reaction was left stirring overnight under argon. The cloudy white mixture was filtered and washed with DCM and THF to remove the diisopropyl urea by-product. TLC and NMR showed the product to be very impure. Flash column chromatography was performed on a small amount (~0.5 g) using 2:1 hexanes:EtOAc as the mobile phase. NMR showed the purified product to be pure, but its near-complete insolubility in water-miscible solvents made this OSu-ester ill-suited to PED and it was not tested. Lastly, N-methylnicotinic acid-OSu iodide was synthesized from the Nic-OSu already present in the lab in a manner similar to Tadjamulia, et al. [53]. Nic-OSu (3.5 g, 16 mmol) was taken up in freshly prepared anhydrous acetone (65 mL). Iodomethane (2 mL, 32 mmol) was added, a condenser was fitted, and the reaction was heated to 45 ºC for 7 hr. The solid was collected by filtration, washed with acetone, and dried in vacuo. The measured melting point (202 ºC) was well below the literature value (222 ºC), indicating impurity. The product was tested in the PED method in crude form.
2.7.6 Synthesis of Pro- and Trp-Containing Test Sequences
To test the mass spectrometric patterns observed for Pro and Trp residues resulting from PED of the library, two peptides of known sequence, one containing Pro and one containing Trp, were synthesized. Each was chosen to match a sequence putatively identified from the library, for comparison. The Pro and Trp test sequences, FRAPLNβRM-resin and DFWYLNβRM-resin, respectively, were synthesized on TentaGel S NH2 resin according to standard FMOC/piperidine protocols employing 4 equivalents of Fmoc-amino acids, HBTU, and HOBt and 8 eq. of N-methylmorpholine (NMM) as base in N,N-dimethylformamide (DMF) for 1 hr. Reaction completion was judged by the absence of color change in the ninhydrin (Kaiser) test. Final de-protection was performed by treatment with Reagent K [7.5% phenol (w/v), 5% water (v/v), 5% thioanisole (v/v), 2.5% ethanedithiol (v/v), 1% anisole (v/v), and 1% triisopropylsilane (v/v) in TFA] for 1 hr at room temperature followed by extensive washing. Degradation and MALDI-TOF MS were as before.
2.7.7 Activated-Disulfide Resin Synthesis for Use in Native Protein PED
An attempt was made to apply the PED technique to the sequencing of full-length
proteins reversibly immobilized on a solid-support. Disulfide chemistry was chosen as
the means of immobilization, with reduction allowing facile release from the resin. Thus,
100 mg of TentaGel M NH2 (10 µm) (Rapp Polymere GmbH) resin was functionalized
with a cysteine residue by reaction with 4 equivalents of Fmoc-amino acid as in 2.7.6
above. The N-terminus was acetylated by 8 eq. of acetic anhydride with 0.1% N,N-dimethylaminopyridine (DMAP) catalyst in 1:1 DCM:DMF for 1 hr at room temperature. This capping reaction was repeated once. Removal of the S-trityl protection of Cys was effected by Reagent K treatment for 1 hr followed by extensive washing.
Next, a resin-bound activated disulfide was prepared. The resin was suspended in
3 mL of Buffer 1 before the addition of 50 µL of 100 mM tris(2-carboxyethyl)phosphine
(TCEP) for 15 min. Immediately after washing with water, the beads were suspended in
3 mL of 15 mg/mL NaHCO3 containing ~150 mg of 5,5’-dithiobis(2-nitrobenzoic acid)
(DTNB) for 5 min. This charging of the Cys side-chain was repeated once before the resin was washed with water and MeOH. After drying, the resin was stored at -20 ºC until used.
An alternative solid support was synthesized by derivatizing cysteinyl-Amino BioMac1800 (Biosearch Technologies) macroporous resin with Aldrithiol-2 (2,2’-dithiodipyridine). Briefly, 100 mg of the BioMac resin was reacted with Cys and acetylated as above. After de-protection and reduction as before, the resin was suspended in 3 mL of ethanol containing 4 eq of Aldrithiol-2 for 2 hr. The resin was washed, dried, and stored at -20 ºC.
2.7.8 Immobilization of Tryptic Digest of Co2+ E. coli Peptide Deformylase
The Co2+-substituted, C-terminally six-His-tagged peptide deformylase from E. coli (EcPDF-His6) was purified in our lab by Kiet Nguyen and Grace Zhou according to protocol [54]. A glycerol stock of the protein was buffer-exchanged into Buffer 2 by passage through a Sephadex G-25 fast de-salting column and quantitated by the Bradford method (BioRad). Immobilization was performed in two ways: 1) the trypsin digest of EcPDF was performed under de-naturing conditions before immobilization, and 2) EcPDF was de-natured and immobilized, and then digested on the resin by trypsin. In the first case, to 47 µL containing 500 ng (~240 pmol) of EcPDF in Buffer 2 was added 33 µL of 3X Buffer 3. The solution was heated to 75 ºC for 2.5 min in a heating block before the addition of ice-cold ddH2O (19 µL) and incubation on ice for 5 min. After 5 min at room temperature, 1 µL of trypsin (1 mg/mL in Buffer 4) was added and the solution was placed in a 37 ºC oven for 9 min. The digest mixture was then transferred to 1.5 mg of a mixed disulfide resin suspended in Buffer 5 and incubated for 45 min. This mixture was transferred to a degradation vessel by several rinses with ddH2O. The resin was further washed with water and MeOH, followed by 5 cycles of PED with Nic-OSu. Subsequent reduction of the disulfide linkage was performed overnight in a microcentrifuge tube in the presence of 2 eq of TCEP in 45 µL water. Samples were de-salted prior to submission for MALDI-TOF analysis by performing two ZipTip (Millipore) C18 pipette tip purifications on each according to the manufacturer’s protocol.
The on-resin trypsin digestion was performed only with the DTNB-activated TentaGel resin. Two 10 mg aliquots of resin were suspended in 50 µL of 3X Buffer 3 (pH 8.4) or (pH 7.8) in Micro BioSpin columns (0.8 mL, BioRad), to each of which was added 60 µL of EcPDF (4.2 mg/mL). The mixtures were heated to 75 ºC for 3 min and then placed at room temperature for 1 min before draining and washing with water. This disulfide coupling reaction was repeated once. The resin was washed with water and suspended in Buffer 6 containing 3 µL of trypsin (1 mg/mL) before incubating at 37 ºC for 6 hr. The digest was repeated with fresh trypsin for another 6 hr. Extensive washing preceded TCEP treatment for disulfide release and ZipTip clean-up for MALDI-TOF submission.
A virtual trypsin digest with [M+H]+ fragment calculations (Table 2.1) was performed using the PeptideMass program found on the ExPASy Proteomics Server web site: www.expasy.org. Only two Cys residues are present in EcPDF, and thus only the two Cys-containing fragments are expected in the mass spectrum.
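The kind of calculation behind such a virtual digest can be approximated in a few lines of Python. This is a simplified sketch rather than a reimplementation of the PeptideMass tool: it assumes cleavage C-terminal to Lys or Arg except before Pro, no missed cleavages, and unmodified residues, so it need not agree with PeptideMass in every case. The function names are invented for the example, which is applied to the first 30 residues of EcPDF as listed in Table 2.1.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # charge carrier for [M+H]+

def tryptic_peptides(sequence):
    """Cut after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides

def mh_plus(peptide):
    """Monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON

N_TERMINUS = "MSVLQVLHIPDERLRKVAKPVEEVNAEIQR"  # residues 1-30 of EcPDF (Table 2.1)
for pep in tryptic_peptides(N_TERMINUS):
    print(f"{mh_plus(pep):10.4f}  {pep}")
```

For this segment the sketch returns the same four fragments and, to within rounding, the same masses as the first four rows of Table 2.1.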
2.8 Results and Discussion
Despite its successful employment in our laboratory during the sequencing of a tetrapeptide library [52], as described earlier in the text, the PIC-based partial Edman degradation technique proved problematic and unreliable. Our endeavors to improve upon the procedure involved analyzing each component of the reaction, including the solvent composition, the degradation and capping reagents, and the temperature and duration of the reaction. Reliability was finally achieved upon replacement of the highly reactive capping reagent PIC by the more controllable OSu-based reagents. The full-length sequencing success rate achieved with Bz-OSu averaged 95% in five trial degradations of ~50 beads from the pentapeptide NH2-XXXXLNββRM-resin library (example mass spectrum in Fig. 2.2).
Next, the library NH2-TAXXpYXXXLNββRM-resin was tested because it presented a much more daunting challenge. First, three constant residue positions must be partially degraded in addition to the five random ones (eight total cycles, plus a final cap) in order to achieve full-length sequences. This offered many more opportunities for error and tested the forgiving nature of the method. Second, the incorporation of phosphotyrosine (pY) in the library had repeatedly been demonstrated in our lab to decrease the detection of residues N-terminal to it. This is believed to be due to the strong double negative charge imparted by the phosphate moiety, which must be overcome during ionization in the positive mode. Thus, every peptide fragment that contains the pY residue (i.e. all truncation products N-terminal to pY) has a handicap toward ionization in the positive mode due to the inherent chemical nature of pY. The inclusion of Arg in the linker as a locus of positive charge acceptance is meant to overcome this handicap, but its compensation was incomplete, as demonstrated in Figure 2.3.
However, the inclusion of pY residues in libraries is necessary for screening SH2 or PTB domains, and thus sequencing problems associated with it must be overcome. With this in mind, one degradation of the pY library was performed with Bz-OSu and one with Nic-OSu as capping reagent. Only 24 beads from each were sequenced, but the Nic-OSu reagent produced a higher percentage (92%) of full-length readable mass spectra compared to the benzoylating agent (75%). It seemed likely that the conditions could be improved for both reagents, but Nic-OSu was more soluble and easier to work with than the Bz analogue. Moreover, it was reasoned that the mildly basic nicotinoyl appendage might work to restore a measure of the basicity lost upon acylation of the N-terminus, and thus improve the sensitivity of detection in positive ion mode, especially for pY peptides. Thus, Bz-OSu was abandoned in favor of its nicotinoyl cousin.
Non-commercial alternatives to Nic-OSu were explored as well. In an attempt to increase ionization efficiency, and thus lower detection limits, the incorporation of permanent positive charges at the N-terminus was explored via the trimethylammoniumbutyric acid and N-methylnicotinic acid capping reagents. These OSu-esters were tested in the PED method in crude form without purification and without success. If successful, these reagents would have been ideally suited for the native protein degradation technique explored later, since incorporation of a strongly basic ionizing site in the linker would not be possible in this situation (vide infra). Moreover, lowering detection limits would also be desirable in library sequencing applications, since lowered limits translate to smaller bead size, and thus larger diversity and greater numbers of randomized positions. However, easy success with the Nic-OSu agent and difficulties encountered with solubilizing and purifying these reagents led to their abandonment after merely cursory examination in the application of library sequencing.
One of the limitations of the PED technique described here is the low rate of detection of Pro and Trp residues as their residue masses. Instead, these amino acids were initially detected in absentia (i.e. gaps in sequences that corresponded to the weight of either Pro or Trp plus some other amino acid, see top-most peak differences “P+A” and “W+F” in Figure 2.4). Upon inspection of these gaps, consistent patterns were recognizable. To test the authenticity and reproducibility of these gaps and patterns, two resin-bound test peptides of known sequence, FRAPNββRM* and DFWYNββRM* (where M* represents the homoserine lactone produced by CNBr cleavage), were synthesized corresponding to putative library-derived sequences. Comparison of mass spectra of the known and putative sequences revealed virtually identical patterns, confirming the identities of Pro and Trp residues in these sequences. Figure 2.4 illustrates the MALDI-TOF spectra of Bz-OSu degradations of the known Pro- and Trp-containing peptides.
For Pro, gaps reproducibly included a group of peaks at mass-differences 94, 112, 126, and 128 units from the preceding amino acid peak. Analysis of this pattern begins with the assumptions that the secondary amine of Pro 1) fails to react with the OSu-ester, but does react with PITC, and 2) forms a relatively stable phenylthiocarbamate (PC). The peak at 618 corresponds to the benzoylated peptide fragment Bz-NββRM*. The assumed failure of the N-terminal Pro to react with Bz-OSu is confirmed by the lack of a 715 peak. However, the uncapped peptide NH2-PNββRM* should appear at a mass of 611 (715 minus 104, mass of the benzoyl group less 1 proton), but is also absent. If NH2-PNββRM* reacted with PITC, the PC-PNββRM* peptide would appear at 746 (611 plus 135), which is present and accounts for the 128 mass-difference. The slow rate of cyclization and cleavage of PC-Pro adducts in the Edman degradation reaction is well characterized, alternatively being referred to as “lag” or “carryover” since it appears in the HPLC chromatograms of subsequent amino acids during cycling [55]. Additionally, automated Edman sequencing is performed under highly optimized conditions that include an anoxic atmosphere to prevent the substitution of oxygen for sulfur in the PC-adduct (-16 mass units), a known occurrence for degradation reactions performed in non-inert atmospheres. This mechanism accounts for the 730 peak (112 mass-difference). Lastly, a loss of water (-18 mass units) accounts for the 712 peak (94 mass-difference). Thus, all but the 126 mass-difference can be rationalized according to known aspects of Edman chemistry.
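The bookkeeping in this paragraph is easier to follow when written out. The snippet below simply restates the nominal (integer) masses quoted in the text and checks that each proposed species gives the observed mass difference from the 618 peak. The constant names and dictionary keys are descriptive labels chosen for this example, not formal nomenclature.

```python
BZ_LADDER_PEAK = 618   # Bz-NββRM*, the rung preceding Pro
PRO_RESIDUE = 97       # nominal residue mass of Pro
BENZOYL_SHIFT = 104    # benzoyl group less one proton
PITC = 135             # nominal mass of phenylisothiocyanate

bz_pro = BZ_LADDER_PEAK + PRO_RESIDUE   # expected if Pro had been benzoylated
free_pro = bz_pro - BENZOYL_SHIFT       # uncapped NH2-PNββRM*

species = {
    "PITC adduct": free_pro + PITC,
    "PITC adduct, O for S": free_pro + PITC - 16,
    "O-for-S adduct, less water": free_pro + PITC - 16 - 18,
}
# Mass difference of each species from the preceding ladder rung.
gaps = {name: mass - BZ_LADDER_PEAK for name, mass in species.items()}
print(bz_pro, free_pro, gaps)
```

The three computed gaps are 128, 112, and 94. None of these steps produces the 126 mass-difference, consistent with its being left unexplained.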
For Trp-containing gaps, peaks were consistently observed 183, 199, and 215 mass units from the preceding peak. Tryptophan’s historical failings in the Edman degradation methodology are related to its proclivity for oxidation [56]. Moreover, various oxidized and doubly-oxidized species have been observed for tryptophan in mass spectra [57]. These oxidized Trp species display a characteristic step-wise +16 mass unit pattern beginning from the native Trp peak (Fig. 2.5). This same incremental +16 pattern is present in the Trp gap of Figure 2.4B (964, 980, and 996, corresponding to the 1516, 1532, and 1548 peaks of Figure 2.5). However, the peak at 964 is three mass units shy of the expected mass of the Bz-WYNββRM* peptide fragment. By analogy to the situation involving Pro, in which the 94 mass-difference peak was also 3 mass units shy of the expected Bz-Pro fragment, one can easily extrapolate the situation to include two +16 side-chain oxidation events.
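A similar check applies to the Trp gap. The sketch below uses only the numbers quoted in this paragraph (mass differences of 183, 199, and 215, and peaks at 964, 980, and 996) together with the nominal Trp residue mass of 186. The variable names are illustrative.

```python
TRP_RESIDUE = 186                 # nominal residue mass of Trp
observed_gaps = [183, 199, 215]   # mass differences from the preceding peak
observed_peaks = [964, 980, 996]  # corresponding peaks in Figure 2.4B

# Spacing between successive members of the series.
steps = [b - a for a, b in zip(observed_peaks, observed_peaks[1:])]

# Offset of each gap from the unmodified Trp residue mass.
offsets = [gap - TRP_RESIDUE for gap in observed_gaps]

# Position of the unmodified Bz-Trp rung implied by the first gap.
preceding_peak = observed_peaks[0] - observed_gaps[0]
expected_bz_trp = preceding_peak + TRP_RESIDUE

print(steps, offsets, expected_bz_trp)
```

The spacing is 16 units at each step, and the first member of the series falls 3 units below the position expected for the unmodified Bz-Trp rung, as stated in the text.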
As a result of the high success rate (generally >90%) of the Nic-OSu PED technique applied to synthetic library sequencing, we began considering its application to intact protein sequencing as a sensitive, inexpensive, and quick alternative to tandem mass spectrometric (MS/MS) and traditional Edman degradation methods. The real challenge was finding an immobilization strategy that would withstand repetitive treatments by acid, base, aqueous, and organic solvents without significant sample loss, while also releasing the peptides for analysis at the end. We decided covalent attachment was least likely to incur sample loss during chemical cycling, and a disulfide linkage would allow for facile release for mass spectrometric analysis. Moreover, the N-terminus required for degradation would be unaffected by this chemistry and the relative infrequency of cysteine residues in proteins would prevent the overcrowding of the spectrum.
To test this strategy, activated mixed-disulfide resins were synthesized using
DTNB and dithiopyridyl derivatives of hydrophilic solid supports. The micro-spherical
TentaGel support failed to capture analyte under all conditions attempted, but the
macroporous BioMac resin employing the less reactive dithiopyridyl disulfide yielded
some success. The Co2+-substituted peptide deformylase of E. coli (EcPDF) was selected as the trial protein because it contains two Cys residues and was available in abundance in our lab (Table 2.1). Conditions were found under which EcPDF was rapidly and efficiently de-natured and digested in solution by trypsin such that the cysteine thiols were not oxidized. After immobilization, several cycles of PED were performed and the peptide mixture was eluted from the resin by treatment with TCEP. An enhanced
MALDI-TOF signal-to-noise ratio was obtained if the TCEP salt was removed by ZipTip
mini-C18 purification prior to submission (Fig. 2.6). Under the optimal conditions tested,
detection limits in the high picomolar-range were the lowest achieved for the cysteine-
containing fragment terminated by Arg. The Cys-containing fragment terminating in Lys
was not detectable in most experiments. The lack of detection of this peptide may have
resulted from the faster oxidation rate of the Cys residue of this peptide or because of the
lowered ionization efficiency of Lys relative to Arg. As a result of the relatively high
detection limit of the Arg peptide and the lack of detection for the other, this
methodology was not pursued.
2.9 Conclusion
The application of PED to support-bound peptides offers a rapid, inexpensive, and highly successful means of sequencing the large numbers of positive beads expected from library screening. Moreover, the elucidation of discrete, individual sequences is an important aspect for the determination of multiple binding motifs and the discovery of contextual relationships among ligand residues. On nearly all levels then, this technique appears superior to pooled sequencing techniques.
An added bonus of the technique is the side-chain acylation of lysine residues with Nic-OSu. Because of the mass shift from 128 to 233 mass-difference units introduced via the side-chain, Lys can easily be distinguished from Gln. Thus, libraries can be even more homogeneous in the display of full-length ligands during screening, since molecular weight degeneracy can be removed during sequencing rather than during synthesis. The replacement of chain-termination encoding during synthesis marks a potentially powerful improvement in library screening, since non-uniform coupling reactivities among amino acids can lead to disparities and biases in ligand presentation.
A potential limiting facet of the technique, however, may be its difficulty with Pro and
Trp residues. While both display reproducible patterns allowing their identification, sequences rich with multiples of either residue may become problematic. On the whole, post-screening peptide-ladder sequencing is an improvement over previous methods.
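The Lys/Gln distinction noted in this section comes down to one addition. As a rough sketch using nominal masses, with the 105-unit nicotinoyl shift taken from the legend of Figure 2.6 and a hypothetical helper name:

```python
NICOTINOYL_SHIFT = 105          # mass added by one nicotinoyl group (Figure 2.6 legend)
NOMINAL = {"Q": 128, "K": 128}  # Gln and Lys share a nominal residue mass

def ladder_step(residue, lys_acylated=True):
    """Nominal mass difference contributed by a residue in the ladder."""
    step = NOMINAL[residue]
    if residue == "K" and lys_acylated:
        step += NICOTINOYL_SHIFT  # the Lys side-chain amine is also nicotinoylated
    return step

print(ladder_step("Q"), ladder_step("K"), ladder_step("K", lys_acylated=False))
```

With side-chain acylation the Lys step becomes 233 while Gln remains at 128, so the two residues no longer coincide in the ladder.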
[Scheme 2.1 graphic: AA1-AA2-AA3-AAn is coupled with PITC to give PTC-AA1-AA2-AA3-AAn; acid (H+) releases PTH-AA1, which is identified by HPLC, and leaves AA2-AA3-AAn for the next cycle of PITC coupling, acid cleavage, and HPLC.]
Scheme 2.1. Traditional Edman degradation. PITC, phenylisothiocyanate. PTC, phenylthiocarbamate. PTH, phenylthiohydantoin.
[Scheme 2.2 graphic: the peptide AA1-AA2-AA3-AAn is treated with PITC + 5% PIC, giving PTC-AA1-AA2-AA3-AAn together with a small amount of PC-AA1-AA2-AA3-AAn; acid (H+) converts the PTC-peptide to PTH-AA1 (discarded) and AA2-AA3-AAn, while the PC-capped peptide is retained; after n cycles the capped products PC-AA1-AA2-AA3-AAn, PC-AA2-AA3-AAn, PC-AA3-AAn, and PC-AAn are analyzed together by mass spectrometry.]
Scheme 2.2. Peptide ladder generation in solution phase via partial Edman degradation. PITC, phenylisothiocyanate. PIC, phenylisocyanate. PTC, phenylthiocarbamate. PTH, phenylthiohydantoin. PC, phenylcarbamate.
Figure 2.1. MALDI-MS obtained by Chait et al. [50] detecting ~25 pmol total peptide (<~5 pmol per species) illustrating the protein ladder sequencing concept. The spectrum represents the eight N-terminal amino acids of the 14-mer [Glu1] fibrinopeptide B.
Scheme 2.3. Generation of the peptide ladder on the solid support. (a) 3:2 PITC/Bz-NHS in 4:1 pyridine/water; (b) TFA. Bz, benzoyl. Subsequent cleavage of the peptides from the resin by CNBr allows analysis by MALDI-TOF MS.
Figure 2.2. MALDI-TOF mass spectrum of a peptide and its truncation products after cleavage from the support by CNBr. The sequence of the peptide is NH2-NIEILNββRM*, where M* is homoserine lactone generated during CNBr treatment of Met.
[Figure 2.3 spectrum: labeled peaks at m/z 732.58, 845.69, 992.79, 1063.85, 1306.94, 1421.00, 1534.09, 1605.04, and 1706.15; the mass differences are annotated I, F, A, pY, N, I, A, and T with increasing m/z.]
Figure 2.3. MALDI-TOF MS of a pY peptide demonstrating the lowered ionization efficiency of the N-terminus complicating full-length sequencing. The sequence of the peptide is NH2-TAINpYAFILNββRM*, where M* is homoserine lactone generated during CNBr treatment of Met.
Figure 2.4. A. MALDI-TOF MS of the sequence FRAPNββRM*. B. MALDI-TOF MS of the sequence DFWYNββRM*. Although neither residue is often seen as its residue mass, both are easily recognized by their consistent, alternative patterns. M* is the homoserine lactone generated from Met during CNBr cleavage from the resin.
Figure 2.5. The step-wise +16 tryptophan oxidation pattern detected by MALDI-TOF MS by Bienvenut and colleagues [57].
A.
  1 MSVLQVLHIP DERLRKVAKP VEEVNAEIQR IVDDMFETMY AEEGIGLAAT QVDIHQRIIV
 61 IDVSENRDER LVLINPELLE KSGETGIEEG CLSIPEQRAL VPRAEKVKIR ALDRDGKPFE
121 LEADGLLAIC IQHEMDHLVG KLFMDYLSPL KQQRIRQKVE KLDRLKARA

B. Calculated Monoisotopic Mass: 19316.28

Mass        Position   Peptide Sequence
1536.8202   1-13       MSVLQVLHIPDER
 288.2030   14-15      LR
 147.1128   16-16      K
1581.8594   17-30      VAKPVEEVNAEIQR
3052.4390   31-57      IVDDMFETMYAEEGIGLAATQVDIHQR
1157.6524   58-67      IIVIDVSENR
 419.1885   68-70      DER
1280.7824   71-81      LVLINPELLEK
1804.8381   82-98      SGETGIEEGCLSIPEQR
 555.3613   99-103     ALVPR
 347.1925   104-106    AEK
 246.1812   107-108    VK
 288.2030   109-110    IR
3433.7242   111-141    ALDRDGKPFELEADGLLAICIQHEMDHLVGK
1226.6489   142-151    LFMDYLSPLK
 431.2361   152-154    QQR
 288.2030   155-156    IR
 275.1714   157-158    QK
 375.2238   159-161    VEK
 403.2299   162-164    LDR
 260.1968   165-166    LK
 246.1560   167-168    AR
  90.0549   169-169    A

Table 2.1. A. The amino acid sequence of EcPDF. B. Virtual tryptic digest of EcPDF with calculated [M+H]+ monoisotopic masses of the fragments. The two Cys-containing peptides are highlighted with the Cys residues in bold.
[Figure 2.6 spectrum: labeled peaks at m/z 1535.53, 1636.57, 1765.61, 1822.62, and 1909.68; the mass differences are annotated T, E, G, and S with increasing m/z.]
Figure 2.6. MALDI-TOF MS demonstrating the N-terminal sequencing of 500 ng (~240 pmol) of EcPDF by PED conducted by reversible disulfide immobilization. Note the masses of the peptides are 105 units larger than the [M+H]+ calculated in Table 2.1 due to the mass of the nicotinoyl cap.
CHAPTER 3
DETERMINATION OF THE PHOSPHOPEPTIDE LIGAND SPECIFICITIES OF
SHP-2 AND SHIP SH2 DOMAINS BY COMBINATORIAL PEPTIDE
LIBRARY SCREENING
3.1 Introduction
Protein-protein interactions are an integral component of many cellular processes such as intracellular signaling. Frequently, the interactions are mediated by modular domains, which can recognize small, specific peptide motifs in their partner proteins. The Src homology 2 (SH2) domain, which binds to specific phosphotyrosyl (pY) peptides, was one of the first examples of such domains [43-45]. A large number of SH2 domains are now known and it has been estimated that the human genome encodes at least 115 SH2 domains [58]. Each SH2 domain interacts with a unique subset of pY peptides and the sequence specificity is primarily determined by the three amino acids immediately C-terminal to pY. Since the initial discovery of the SH2 domain, many additional types of modular recognition domains have been discovered (e.g., BIR, SH3, PDZ, FHA, etc.) [59]. However, for the majority of these domains, their sequence specificity or in vivo interaction partners are currently unknown.
One approach to sorting out the complex protein-protein interaction network is to determine the sequence specificity of these modular domains through the screening of 36 combinatorial peptide libraries and then use the consensus sequence(s) to search the protein databases. Several combinatorial methods have been reported. In their pioneering work with SH2 domains, Cantley and co-workers employed affinity columns containing an immobilized SH2 domain to enrich SH2-binding sequences from a pY peptide library [47], a technique later expanded upon by others [60]. Sequencing of the enriched peptide pool by conventional Edman degradation revealed the preferentially selected amino acid(s) at each position. A variation of this method involved screening support-bound libraries against a GST-SH2 domain and detecting with fluorescently lablelled αGST antibodies [61]. The positive beads with the bound SH2 were removed from the library using a fluorescence-activated bead sorter and all of the selected beads were pooled and sequenced by Edman degradation. As described in Chapter 2, this method of sequencing provides information on the most preferred amino acid(s) at each position but, importantly, does not give individual sequences. Since the method selects for both affinity and abundance of certain types of sequences, a high-affinity peptide of low abundance may not emerge from the single derived consensus sequence. A second method involves the iterative synthesis and screening of sub-libraries or “positional scanning” [5]. However, in addition to being highly labor intensivethis method suffers from the same drawbacks and lack of contextual sequence information as the first method. In a third phage display method, bacteriophage bearing short random peptide sequences were selected against an immobilized modular domain [15, 62, 63]. The sequences of the binding peptides were determined after iterative amplification and selection of the bound phage by DNA sequencing. 
This method is highly effective for modular domains that recognize unmodified peptides but generally does not work well with protein domains that recognize post-translationally modified peptides [63, 64]. Here we describe another method, in which resin-bound peptide libraries are selected against a
protein receptor and the positive beads are removed from the library and individually
sequenced by partial Edman degradation, a high-throughput technique recently developed
by this laboratory. Our method gives a statistically significant number of individual
sequences, from which a consensus sequence(s) can be derived. This method is applied
to determine the sequence specificity of three SH2 domains from phosphatases SHP-2
and SHIP.
3.2 Experimental Procedures
3.2.1 Vector Constructs
Originally, unmodified pMAL-c2 (New England Biolabs, NEB) and pGEX-2T
(Pharmacia) constructs were used for the expression of fusion proteins for library
screening. However, pMAL-c2 and several additional vectors were modified to enhance
some aspect of affinity tag purification, solubility, expression, etc., relative to the original.
First, two very similar pET-MAL vectors, pET-MAL I and II, were created by sub-cloning the malE gene by PCR from pMAL-c2 into pET-28a (Novagen) with retention of all pMAL multiple cloning sites (MCS) (Appendix Scheme A1). The malE gene and MCS fragment was amplified with the following primers: sense, 5’-CCG AAC
TCC ATA TGA AAA TCG AAG AAG GTA AAC TGG-3’; anti-sense for pET-MAL I,
5’-GGA CGC TCG AGA ACG ACG GCC AGT GCC AAG CT-3’; anti-sense for pET-
MAL II, 5’-GGA CGC TCG AGT AAG CTT GCC TGC AGG TCG AC-3’. PCR was
performed simultaneously for both sets of primers in 50 µL reactions containing 80 ng of
MidiPrep-quality (Qiagen) pMAL-c2 template, 0.5 µM primers, 0.2 µM dNTPs (0.2 µM
of each nucleotide), 1X BSA, 1X ThermoPol Reaction Buffer, and 1 U of Deep VentR DNA polymerase (NEB). Thermocycling was programmed on an Applied Biosystems
PCR System 2400 to include 1 cycle of 94 ºC (2’), 59 ºC (1’), 72 ºC (1’), and 19 cycles of
94 ºC (1’), 62 ºC (45”), 72 ºC (45”). The PCR products were purified by spin column kit
(Qiagen), digested at the primer-encoded sites with Nde I and Xho I (NEB) endonucleases according to the manufacturer’s protocols, and re-purified by spin column.
MiniPrep purified (Qiagen) pET-28a vector was similarly digested with Nde I and Xho I with the addition of Calf Intestinal Alkaline Phosphatase (CIP) (NEB) and purified by spin column. After quantitation of the digest products by agarose gel electrophoresis, ligation was performed overnight at 16 ºC in 15 µL reactions containing 140 ng of vector, 5- and 10-fold excesses of insert (malE), 1X T4 DNA Ligase Reaction Buffer, and 600 U of T4
DNA Ligase (NEB). After transformation of 5 µL of each reaction into 50 µL of chemically competent XL1 Blue E. coli (Stratagene) and screening, glycerol stocks of each construct were preserved at -80 ºC. The pET-MAL II construct was deemed preferable and was the only one of the two used for subsequent sub-cloning.
Hereafter, all references to pETMAL constructs refer to the pET-MAL II variant.
Furthermore, the control MBP, which contains N- and C-terminal six-His tags, was expressed from this variant (see Section 3.2.3).
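As a quick consistency check on the three pET-MAL primers listed above, the Nde I and Xho I recognition sequences can be located computationally. The following is a minimal sketch rather than part of the original protocol: the helper name `find_sites` is hypothetical, the primer strings are copied from the text, and the recognition sequences used (CATATG for Nde I, CTCGAG for Xho I) are the standard ones for these enzymes.

```python
# Standard recognition sequences for the two enzymes used in this cloning step.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Primers as written in the text (5'->3'); spaces are stripped by find_sites.
PRIMERS = {
    "sense": "CCG AAC TCC ATA TGA AAA TCG AAG AAG GTA AAC TGG",
    "antisense_I": "GGA CGC TCG AGA ACG ACG GCC AGT GCC AAG CT",
    "antisense_II": "GGA CGC TCG AGT AAG CTT GCC TGC AGG TCG AC",
}


def find_sites(primer, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present."""
    seq = primer.replace(" ", "").upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = [
            i
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each primer carries exactly one of the two sites near its 5' end, preceded by a few extra bases, which is consistent with the Nde I/Xho I digestion described for the PCR products.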
The next modifications involved the PinPoint Xa-1 (pPNPT) vector from
Promega. This plasmid encodes a fragment of a biotin carrier domain (BCD) from
Propioninbacter freundii as an N-terminal fusion. This domain is efficiently biotinylated at a specific lysine residue in E. coli, thus allowing it to be used as an affinity purification tag as well as for library screening. However, it was decided that purification by immobilized metal affinity chromatography (IMAC) would be less expensive and of greater benefit. Therefore, 32-mer oligonucleotides encoding a six-His-tag and an Eco RI restriction site were designed such that, upon annealing, they formed Not I
complementary 5’-overhangs at both ends. However, upon ligation one of the ends
destroys the Not I site for future restriction. The oligonucleotides 5’-GGC CGC CAT
CAT CAT CAT CAT CAT TGA ATT CT-3’ and 5’-GGC CAG AAT TCA ATG ATG
ATG ATG ATG GC-3’ were phosphorylated by T4 Polynucleotide Kinase (NEB). After
heating 15 µL of 100 µM double stranded insert to 45 ºC for 5’ and cooling on ice,
phosphorylation was performed in 30 µL of 1X T4 DNA Ligase Buffer and 10 U of T4
Polynucleotide Kinase at 37 ºC for 1 hr. After heat denaturation of the kinase at 65 ºC for
20 min, the His6-tag was ligated as above into the PinPoint vector digested by Not I.
Following transformation into the XL1 Blue strain, a colony containing the additional
Eco RI site and the correctly oriented insert, pPNPT-6HIS, was detected by restriction
mapping and stored at -80 ºC as a glycerol stock.
No further characterization was performed on the pPNPT-6HIS construct because
a new scheme was undertaken. By transferring the biotin carrier domain to the pET-14b
vector (Novagen), it was possible to add an N-terminal His6-tag and remove lac operator control of induction for use with pBirA [65] co-expression in the BL21-AI E. coli strain
(Invitrogen). The procedure for sub-cloning by PCR into pET-14b was generally the same as that described for pET-MAL (Appendix Scheme A2). The pET vector was first altered to remove several restriction sites. After digesting the vector with Eco RI, the sticky-ends were filled in by the addition of dNTPs to 100 µM and 1 U of DNA
Polymerase I, Large (Klenow) Fragment (NEB) for 12’ at room temp. Subsequent to agarose gel purification and extraction by Gel Extraction Kit (Qiagen), digestion by Eco
RV allowed blunt-end ligation, which removed the Eco RI, Hind III, and Eco RV sites from pET-14b. The BCD and MCS were amplified from the pPNPT template by PCR using the primers 5’-GGA ACC ACA TAT GAA ACT GAA GGT AAC AGT CAA C-3’
and 5’-GGA ATT CAC TAT AGA ACC AGA TCG CG-3’. Thermocycling consisted of one cycle of 94 ºC (2’), 60 ºC (1’), 72 ºC (1’), and 19 cycles of 94 ºC (1’), 63 ºC (45”), 72
ºC (45”). The PCR product was purified, digested by Eco RI, blunt-ended as above, spin column purified, and digested by Nde I. The pET vector was digested with Bam HI, blunt-ended, and purified before treatment with Nde I and CIP. The single sticky-end ligation proceeded smoothly as above. Sequence confirmation of this pET-PNPT construct by the dideoxy chain-termination method was performed at the Plant Genome
Facility and a glycerol stock of XL1 Blue cells containing this construct was made for storage at -80 ºC.
Additionally, the hydrophilic linker and MCS from the pMAL vector were inserted C-terminally (Appendix Scheme A3). This required the removal of one of the
Xho I sites within the biotin carrier gene by incorporation of a silent mutation.
Introduction of a point mutation was accomplished using the QuikChange Site-Directed
Mutagenesis Kit (Stratagene) and the primer pair 5’-CCG TGC TCG TTC TTG AGG
CCA TGA AGA TGG-3’ plus its complement. A 50 µL reaction containing 10 ng of
pET-PNPT template, 0.2 µM dNTPs, 0.16 µM each primer, 1X reaction buffer, and 2.5 U
of Pfu Turbo polymerase was thermocycled once for 95 ºC (30”), followed by 14 cycles
of 95 ºC (30”), 52 ºC (1’), 68 ºC (12’), and then held at 68 ºC for an extra 12’. The
methylation-specific endonuclease Dpn I was added for 1 hr. at 37 ºC before
transformation. Sequencing confirmed the single mutation. Next, the linker and MCS
were amplified from the pETMAL template with the primers 5’-GGA TAT CGC AGA
CTA ATT CGA GC-3’ and the T7-terminator primer 5’-GCT AGT TAT TGC TCA
GCG G-3’ and digested with Eco RV and Xho I prior to blunt-ending by Klenow treatment. The mutated vector was digested with Xho I and Not I before the Klenow fragment filled in the overhangs. Blunt-end ligation proceeded at room temperature for
90 min rather than 16 ºC overnight. Sequencing confirmed the authenticity of the final product, dubbed pPPTmal vector. A glycerol stock of XL1 Blue cells containing pPPTmal was stored at -80 ºC.
In an attempt to conduct fluorescence-based library screenings, a green fluorescent
protein (GFP) fusion system was sought. From the Gopalan lab, we received what was
believed to be the pGFPuv vector distributed by Clontech. However, it was determined
by sequencing that the plasmid contained the GFPuv gene, but as a six-His-tagged fusion
in pET-29. The tag was useful, but work was necessary to make the vector usable for
domain fusion (Appendix Scheme A4). First, the removal of a stop codon preceding the
MCS and destruction of a Bam HI site within the coding region were effected in one step
by PCR. The primer 5’-CCT GCA AGA TCT GTT CAA CTA GCA GAC CAT TAT C-
3’ hybridized just 3’ of the Bam HI site of the gene, but encoded a Bgl II restriction
sequence in its place. The sticky-ends of the 5’ overhangs of these two enzymes are
cohesive, but after ligation cannot be re-cut by either enzyme. The primer 5’-CCA CGA
ATT CAT CCA TGC CAT GTG TAA TCC-3’ contained an Eco RI site and hybridized a
few bases upstream of the vector’s own Eco RI site, excluding the stop codon. Thus, the
pET29-GFPuv plasmid served as the template for PCR amplification of a fragment
containing a Bam HI to Bgl II site switch and devoid of a stop codon. The fragment was
digested with Bgl II and Eco RI while pET29-GFPuv was digested with Bam HI and Eco
RI with de-phosphorylation by CIP. After ligation, sequencing confirmed restriction site destruction by a single silent C→T substitution and removal of the termination codon upstream of the MCS.
The GFP vector was now suitable for the construction of domain fusions for screening, but the GFPuv gene had been optimized for excitation by 354-nm light
sources, hence the “UV” moniker. Since our fluorescence microscope’s excitation
wavelength was ~488-nm, several mutations were made to increase excitation at this
wavelength in order to match our equipment. The following mutations were introduced
by QuikChange SDM Kit as above because they had been described to enhance the 488-nm induced fluorescence and/or the solubility/folding of GFP: S72A, I167T, F64L, S65T,
and V68L. In addition, one of the Nde I sites was destroyed by silent mutation. The
primers (along with their implied complementary oligos) used for these mutations were,
consecutively: Nde I 5’-GTT ATCNCGG ATC ACA TGC AAC GGC ATG-3’; S72A 5’-
GGT GTT CAA TGC TTT GCC CGT TAT CCG GAT C-3’; I167T 5’-GCT AAC TTC
AAA ACC CGC CAC AAC ATT GAA G-3’; F64L,S65T,V68L 5’-CCA ACA CTT
GTC ACT ACT TTG ACT TAT GGT CTT CAA TGC TTT GCC-3’. The long triple
mutation primer was purified by 12% Urea-PAGE according to the protocol described in Chapter 5. Thermocycling for all reactions was 95 ºC (30”), followed by 16 cycles of 95 ºC (30”), 51 ºC (1’), 68 ºC (14’), and then a hold at 68 ºC for an extra 14’, except for the triple mutation reaction, in which case 18 cycles were performed. Dpn I treatment preceded transformation and sequencing for each mutation.
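The substitutions encoded by the long triple-mutation primer can be checked by translating it in frame. The sketch below is illustrative only and rests on two assumptions not stated in the text: that the primer's first codon is in frame with the GFP coding sequence, and that this first codon corresponds to Pro58 of GFP (numbering taken from the published wild-type sequence). The function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Triple-mutation primer exactly as written in the text (5'->3').
TRIPLE = "CCA ACA CTT GTC ACT ACT TTG ACT TAT GGT CTT CAA TGC TTT GCC"


def translate(primer):
    """Translate a primer from its first base, ignoring spaces."""
    seq = primer.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def number_residues(primer, first=58):
    """Map residue number -> amino acid, ASSUMING the first codon is `first`."""
    return {first + i: aa for i, aa in enumerate(translate(primer))}


if __name__ == "__main__":
    print(translate(TRIPLE))
    for position, residue in number_residues(TRIPLE).items():
        print(position, residue)
```

Under these assumptions the primer encodes Leu at position 64, Thr at 65, and Leu at 68, in agreement with the F64L, S65T, and V68L label attached to the primer, and it also retains Ala at position 72 from the earlier S72A mutation.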
Finally, in order to add the hydrophilic linker and MCS from pMAL C-terminal to
GFP, an Eco RV restriction site was introduced by QuikChange SDM one nucleotide 5’ of the Eco RI site of the endogenous MCS. The primer 5’- GGG ATT ACA CAT GGG
ATA TCT GAA TTC ACT ATG G-3’ and its complement were thermocycled as for the
triple mutation above in 50 µL reactions. After confirming the mutation by restriction
mapping, the pMAL linker was amplified by PCR exactly as before and ligated at the Eco RV and Hind III sites of the pGFP vector. After confirmation by dideoxy sequencing, the
XL1 cells harboring this pGFPmal construct were preserved at -80 ºC as a glycerol stock.
3.2.2 SHP-2 SH2 Domain Constructs
The DNA sequences coding for SHP-2 N-SH2 domain (aa 1-106), C-SH2 domain
(aa 108-220), and SHIP SH2 domain (aa 1-109) were isolated by PCR from pET22-SHP2
[66], pBS-SHP2 [67], and pGEX2T-SHIP(SH2) [68] plasmid templates, respectively.
The DNA primers used were: N-SH2, (T7 promoter primer) 5’-TAA TAC GAC TCA
CTA TAG GG-3’ and 5’-AGA TTA GAA GCT TTC AAT CTG CAC AGT TCA GAG
GAT ATT TAA GC-3’; C-SH2, 5’-ATA TAG AAT TCA TGA CCT CTG AAA GGT
GGT TTC ATG GAC A-3’ and 5’-AGA TTA GAA GCT TTC AAC GAG TCG TGT
TAA GGG GCT GCT-3’; SHIP SH2, 5’-GCG AAT TCA TGC CTG CCA TGG TCC
CTG G-3’ and 5’-CGT CCA AGC TTC ACT CCT CCT CCA GGG GCA C-3’. The
PCR products of the C-SH2 and SHIP domains were digested with the restriction endonucleases Eco RI and Hind III, and ligated into their corresponding sites in pMAL-
c2. However, because the primer 5'-ATA TAG AAT TCA TGA CAT CGC GGA GAT
GGT TTC A-3’ repeatedly failed in the amplification of the N-SH2 domain, and was shown to be incapable of amplification when paired with the proven anti-sense primer of the C-SH2 domain, an alternative strategy was pursued for the sub-cloning of this domain. The T7 promoter primer hybridized upstream of the SHP-2 gene in the pET22-SHP2 construct and also incorporated an in-frame Nde I site from the vector.
Fortunately, an in-house pMAL-c2’-SHPTP vector contained an in-frame 5’ Nde I sequence. Unfortunately, the site is not unique within the plasmid and therefore, partial
digestion was required to isolate the desired recipient vector. After 90 min of Hind III linearization of 6.0 µg of pMAL-c2’-SHPTP vector in 100 µL, 20 µL aliquots were
removed and treated with Nde I for 15, 20, 25, and 30 min before heat denaturation (70 ºC for 20’). De-phosphorylation by CIP and agarose gel purification of the
~6.2 kb fragment delivered the correct recipient vector. The PCR product was double digested by Nde I and Hind III and ligated to pMAL-c2’. Each construct, pMAL-NSH2, pMAL-CSH2, and pMAL-SSH2, was confirmed by restriction mapping.
Subsequently, each domain was sub-cloned from the above pMAL templates to the pETMAL vector. Agarose gel purified Eco RI/Hind III digestion fragments from
each pMAL-domain fusion were ligated to the complementary pETMAL sticky-ends, yielding
the respective His6-MBP-domain fusions. Furthermore, pPPTmal and pGFPmal fusions
were generated by PCR-based sub-cloning from the pETMAL-domain templates using
the malE sequencing primer (5’-GGT CGT CAG ACT GTC GAT GAA GCC-3’) and the
T7-terminator primer. Since all constructs had been designed to share the same reading frame and MCS, digestion and ligation at Eco RI/Hind III were universally applicable.
Lastly, simple N-terminal His6-tag fusions were introduced to each SH2 domain
by sub-cloning into the pET-28a vector. These smaller constructs were for use in surface
plasmon resonance (SPR) experiments. The same scheme as for the pPPTmal and
pGFPmal constructs was employed. Namely, the malE and T7-terminator sequencing
primers amplified each domain, which was subsequently ligated to the in-frame Eco
RI/Hind III sites of pET-28a. The His6-domain fusions were sequenced from this
construct, confirming the identities of their predecessor pETMAL constructs.
3.2.3 Control and SHP-1 Constructs
The pMAL-∆lacZ, pET28-NSH2(SHP-1), and pET28-CSH2(SHP-1) plasmids were generated by Kirk Beebe. The pMAL-NSH2(SHP-1) and pMAL-CSH2(SHP-1) fusions have been previously described [69, 70]. The ∆lacZ plasmid was the source of the MBP control lacking a domain fusion. The pETMAL vector encoded its own stop codon after the C-terminal His-tag, thus its MBP product contained two His-tags.
Control constructs for pPPTmal and pGFPmal were generated in a fashion similar to that of pMAL-∆lacZ, namely, linearized vector was treated with Klenow fragment and re-ligated in order to hasten the occurrence of a stop codon in the reading frame. For pPPTmal the restriction sites at which this was performed were Eco RI/Xho I, while for pGFPmal Hind III was used. In each case, the stop codon was reached within 7 amino acids.
3.2.4 Purification and Biotinylation of His6-MBP-SH2 Proteins
E. coli BL21(DE3) cells harboring the proper pETMAL-SH2 plasmid were grown in LBK medium + glucose (2 g/L) to the mid-log phase and induced by the addition of
300 µM isopropyl-β-D-thiogalactoside (IPTG) for 2.5 h at 30 ºC. The cells were harvested by centrifugation and lysed in Buffer 7 by passing through a French press.
Each MBP-SH2 protein was purified from the crude lysate on an amylose column according to manufacturer’s recommended procedures (NEB). The protein was eluted from the column in Buffer 8, concentrated in an Amicon stirred-cell concentrator to approximately 4 mg/mL, and treated with 2 equivalents of Biotin-OSU (dissolved in
DMF) at room temperature for 45 min. Excess biotin was removed by passing the solution through a Sephadex G-25 column equilibrated in Buffer 9. After concentration and addition of glycerol (final 40%), the protein was quickly frozen in a dry
ice/isopropanol bath and stored at -80 ºC.
For BIAcore work, the above procedure was performed without the addition of
the biotin label. Rather, the concentrated protein was polished and buffer exchanged into
Buffer 10 by passage in 4 mL aliquots through a size exclusion column (XK-16
Superdex-75) connected to an FPLC system (Pharmacia) at a flow rate of 1.1 mL/min.
After concentration by centrifugal ultrafiltration (Millipore), all proteins were flash
frozen without the addition of glycerol.
3.2.5 Purification of His-tagged SH2 Domains
N-terminally histidine-tagged SH2 domains were expressed in the Rosetta
CodonPlus strain of E. coli BL21(DE3) cells (Novagen). Protein expression was induced
by the addition of 300 µM IPTG to mid-log phase cells and incubation at 30 ºC for 3 h.
Cells were harvested by centrifugation and lysed in a French pressure cell in Buffer 11.
Crude lysate was loaded onto a Talon cobalt affinity column (10 mL). After washing
with 10 column volumes of Buffer 12, the SH2 protein was eluted with Buffer 13 and exchanged through a Superdex-75 column connected to an FPLC system into Buffer 10.
After concentration by centrifugal ultrafiltration, each protein was flash frozen in the
absence of glycerol.
3.2.6 Purification of Full-Length SHP-2 for Stimulation Assays
The full-length SHP-2 protein was expressed and purified in a manner very
similar to that described in the literature [66]. Specifically, E. coli Rosetta CodonPlus
BL21(DE3) cells harboring the pET22-SHP2 plasmid were grown in 4 L of LBA medium to the mid-log phase and induced by the addition of 300 µM IPTG for 3.5 hr at
30 ºC. The cells were harvested by centrifugation and lysed in Buffer 14 by passage
through a French press. The lysate was applied to a Q-Sepharose Fast Flow (Pharmacia) column (2.5 x 10 cm) equilibrated with Buffer 15 and washed with 2 column volumes of
Buffer 15 prior to the development of an elution gradient (100% Buffer 15 to 100%
Buffer 16) in 300 mL. Activity towards p-nitrophenylphosphate (pNPP) was detected in the flow-through, wash, and early part of the gradient. These fractions were pooled (~500 mL total volume) and the pH was adjusted by the slow addition of 75 mL of a 100 mM
MES, pH 5.6 solution. The entire volume was loaded onto an SP-Sepharose Fast Flow
(Pharmacia) column (2.5 x 15 cm) equilibrated with Buffer 17 and washed with 4 column volumes of the same. Elution was performed by the development of a gradient (100%
Buffer 17 to 100% Buffer 18) in 300 mL. Activity was detected in fractions from the middle to the end of the gradient. These fractions were pooled and concentrated to ~35 mL in an Amicon stirred-cell pressure concentrator. After buffer exchanging ~10 mL aliquots into Buffer 17 by passage through G-25 columns, 10 mL aliquots were injected at 2.5 mL/min by Superloop onto a Mono S HR 10/10 column connected to an FPLC pump system (Pharmacia). Gradient elution between Buffers 17 and 18 in 250 mL adequately resolved SHP-2 (see Appendix Fig. A3.1), which was stored at -80 ºC in the presence of
~40% glycerol.
3.2.7 Synthesis of pY Library
The library was synthesized on 5 g of 90-µm TentaGel S NH2 resin using standard
Fmoc chemistry employing HBTU/HOBt/DIPEA as the coupling reagents. The invariant
positions (LNββRM, pY, and N-terminal TA) were synthesized with 4 equiv of Fmoc-amino acids and the coupling reaction was terminated after ninhydrin tests were negative.
The random positions were synthesized using the split-synthesis method [4, 6]. The
coupling reactions employed 5 equiv of Fmoc-amino acids and proceeded for 45 min, after
which time the coupling reaction was repeated to ensure complete reaction. To facilitate
sequence determination by mass spectrometry, 5% Ac-Gly was added to the coupling
reactions of Leu and Lys, whereas 5% Ac-Ala was added to the coupling reactions of Nle
[71]. After removal of the terminal Fmoc group, the resin-bound library was washed
with dichloromethane and deprotected using Reagent K (defined in Chapter 2) at room
temperature for 60 min. The resultant NH2-TAXXpYXXXLNββRM-resin library
(hereafter referred to as pY library) was washed with TFA, DCM, and MeOH before
drying for storage at –20 ºC.
3.2.8 Colorimetric Library Screening
In a micro BioSpin column (0.8 mL, BioRad), 100 mg of the pY library was
swollen in dichloromethane, washed extensively with methanol, ddH2O, and Buffer 19,
and blocked for 1 h with 800 µL of Buffer 19 containing 0.1% gelatin. The resin was drained and resuspended in 800 µL of a biotinylated MBP-SH2 domain of interest (10–50
nM final concentration) in Buffer 19 plus 0.1% gelatin. After overnight incubation at 4 ºC
with gentle mixing, the resin was drained and re-suspended in 800 µL of Buffer 20
containing 1 µL of streptavidin-alkaline phosphatase (SA-AP, ~1 mg/mL, Prozyme).
After 10 min of gentle mixing at 4 ºC, the resin was rapidly drained and washed with 400
µL of Buffer 20, 400 µL of Buffer 21, 400 µL of Buffer 19, and again with 400 µL of
Buffer 21. The resin was then transferred into a 35-mm Petri dish in 5 x 300 µL of Buffer
21. Upon addition of 80 µL of 5 mg/mL BCIP in Buffer 21, intense turquoise color developed on positive beads in ~45 min, at which point the staining reaction was quenched by the addition of 3 mL of 8 M guanidine-HCl, pH 8.0. The resin was transferred back into the BioSpin column, extensively washed with water, and re-plated in the Petri dish from which colored beads were picked manually using a pipette under a dissecting microscope. The positive beads were sorted by color intensity into “intense”,
“medium”, and “light” categories. Control experiments with biotinylated MBP produced no colored beads under identical conditions.
3.2.9 Partial Edman Degradation and Peptide Sequencing
Positive beads were pooled according to color intensity and subjected to partial
Edman degradation in a procedure similar to that of Chapter 2, but with a few optimizations for the pY library. The beads were suspended in 66% pyridine (aq) containing 0.1% Et3N, to which was added an equal volume of 5% PITC in pyridine containing a variable amount of Nic-OSU. After rapid mixing, the reaction was allowed to proceed for 6 min. The beads were washed with methanol, dichloromethane, and TFA and suspended in TFA (2 x 6 min). After extensive washing with dichloromethane and pyridine, the cycle was repeated. An optimized procedure was established for this library by trial and error using unselected beads which employed varying PITC/Nic-OSU mole ratios as follows: 6:1 for the N-terminal T and A; 4.5:1 for the N-terminal random positions; no Nic-OSU during pY degradation; and 5:1 for the C-terminal random positions. Finally, the linker sequence was capped by Nic-OSU in the absence of PITC.
The beads were then treated for 20 min with ~1 mL of TFA containing NH4I (10 mg) and
Me2S (20 µL) on ice to reduce any oxidized methionine. The beads were washed with ddH2O, placed in individual microcentrifuge tubes, and treated overnight in the dark with 20 µL of 70% TFA containing CNBr (20 mg/mL). After evaporating to dryness, the
peptides were dissolved in 5 µL of 0.1% TFA in water. One µL of the peptide solution
was mixed with 2 µL of 0.1% TFA in 50% acetonitrile saturated with 4-hydroxy-α-
cyanocinnamic acid and spotted onto a 96-well MALDI-TOF sample plate. MALDI-
TOF mass spectrometry was performed on a Bruker Reflex III instrument in an
automated manner. Sequence determination from the mass spectra was performed
manually.
3.2.10 Synthesis of Biotinylated pY Peptides
All pY peptides contained a common C-terminal linker, -LNBKR-NH2. Each
peptide was synthesized on ~65 mg of CLEAR-amide resin using standard
Fmoc/HBTU/HOBt chemistry. The N-terminus was acetylated by treatment with Ac2O.
Cleavage and de-protection were carried out as previously described. Approximately 3
mg of the crude peptide was dissolved in a minimal volume of DMSO (300–500 µL, with
sonication) and reacted with 1 equiv of NHS-PEG4-biotin (Quanta BioDesign, Ltd.) in 25
µL of DMSO. After 45 min at room temperature, the mixture was triturated twice with 20
volumes of Et2O. The precipitate was collected and dried under vacuum. The biotinylated pY peptide was purified by reversed-phase HPLC on a C18 column (Vydac 300 Å, 4.6 x
250 mm). The identity of each peptide was confirmed by MALDI-TOF mass spectrometric analysis. This procedure resulted in the addition of a 15-atom hydrophilic linker between the side chain of the C-terminal lysine and the carboxyl group of biotin.
3.2.11 Determination of Dissociation Constants by BIAcore
All measurements were made at room temperature on a BIAcore 3000 instrument.
A sensorchip containing immobilized streptavidin was conditioned with 1 M NaCl in 50
mM NaOH (aq) according to manufacturer’s instructions. The biotinylated pY peptides
were immobilized onto the sensorchip by flowing 6 µL of ~8 µM pY peptide solution in
HBS-EP Buffer purchased from BIAcore. Initial studies using the MBP-fusions proved unreliable, and thus, sensorgram data for the secondary plot analysis were acquired by passing increasing concentrations (0–5 µM) of a His6-SH2 protein in HBS-EP Buffer over the sensorchip for 2 min at a flow rate of 15 µL/min. A blank flow cell (no immobilized pY peptide) was used as control to correct for any signal due to the solvent
bulk and/or nonspecific binding interactions. In fact, neither significant bulk effect nor
nonspecific binding was observed. Between runs, the sensorchip surface was
regenerated by treatment with Strip Buffer for 5–10 s at a flow rate of 100 µL/min. The
equilibrium response unit (RUeq) at a given SH2 protein concentration was obtained by subtracting the response of the blank flow cell from that of the sample flow cell. The dissociation constant (KD) was obtained by fitting the data to the equation,
RUeq = RUmax[SH2]/(KD + [SH2])
where RUeq is the measured response unit at a certain SH2 protein concentration and
RUmax is the maximum response unit.
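The fit to the equation above can be illustrated numerically. The following is a minimal sketch, not the fitting routine actually used in this work (which is not specified in the text). It relies on the fact that the model is linear in RUmax, so for any trial KD the best RUmax has a closed form and only KD needs a one-dimensional search, done here by golden-section minimization on log10(KD). The function names, the search bounds, and the example data are all hypothetical, and at least one concentration must be nonzero.

```python
import math


def profile_rumax(kd, conc, ru_eq):
    """For a trial KD, return the least-squares RUmax and the residual SSE."""
    frac = [c / (kd + c) for c in conc]  # fractional occupancy at each [SH2]
    rumax = sum(y * f for y, f in zip(ru_eq, frac)) / sum(f * f for f in frac)
    sse = sum((y - rumax * f) ** 2 for y, f in zip(ru_eq, frac))
    return rumax, sse


def fit_kd(conc, ru_eq, kd_min=1e-3, kd_max=1e3, iterations=100):
    """Fit RUeq = RUmax*[SH2]/(KD + [SH2]); KD is in the units of conc.

    Assumes the sum of squared errors has a single minimum between
    kd_min and kd_max (an assumption of this sketch).
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = math.log10(kd_min), math.log10(kd_max)

    def sse(log_kd):
        return profile_rumax(10 ** log_kd, conc, ru_eq)[1]

    for _ in range(iterations):
        left = hi - inv_phi * (hi - lo)
        right = lo + inv_phi * (hi - lo)
        if sse(left) < sse(right):
            hi = right
        else:
            lo = left
    kd = 10 ** ((lo + hi) / 2)
    return kd, profile_rumax(kd, conc, ru_eq)[0]


if __name__ == "__main__":
    # Synthetic, noise-free data (hypothetical): KD = 0.5 uM, RUmax = 100.
    conc = [0.0, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
    ru = [100.0 * c / (0.5 + c) for c in conc]
    print(fit_kd(conc, ru))
```

With real sensorgram data the blank-subtracted RUeq values would replace the synthetic list, and the quality of the fit should be inspected rather than taken for granted.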
3.3 Results
3.3.1 Library Design, Synthesis, and Screening
To demonstrate the effectiveness of the combinatorial method, we chose to
determine the sequence specificity for the SH2 domains of protein tyrosine phosphatase
SHP-2 and inositol phosphatase SHIP. SHP-2 and a structurally similar phosphatase,
SHP-1, belong to a subfamily of PTPs which each contain two SH2 domains N-terminal
to their catalytic domain, whereas SHIP contains a single SH2 domain. All three proteins
are involved in a variety of signaling pathways [72]. Despite their sequence homology,
SHP-1 and SHP-2 have very different in vivo functions. For example, SHP-2 generally
acts as a positive regulator for the various signaling pathways, whereas SHP-1 primarily
acts as a negative regulator of signaling events [72]. Some studies show that SHP-1,
SHP-2, and SHIP recognize distinct pY motifs on various receptors via their SH2
domains, while others report that the three enzymes can compete for binding to a common receptor bearing one or more immunoreceptor tyrosine-based inhibition motifs
(ITIMs) [73]. These data suggest that the SH2 domains in SHP-1, SHP-2, and SHIP have distinctive but partially overlapping specificities. Therefore, a detailed study on their sequence specificities would be very helpful in identifying their physiological targets and determining their cellular functions.
The specificity of an SH2 domain is primarily determined by the pY residue and the three residues immediately C-terminal to pY [74-76], although it has been reported
that, for a few SH2 domains including those of SHP-1 and SHP-2, the -2 position (2
residues N-terminal to pY, which is position 0) is also important for high-affinity
interaction [77, 78]. Thus, we designed a pY library, H2N-TAXXpYXXXLNββRM-resin, where X represents norleucine (Nle) or any of the 18 natural amino acids other than Met and Cys (19 residues in total), and β is β-Ala. The N-terminal dipeptide TA helps reduce potential bias
caused by electrostatic interactions between an SH2 protein and the free N-terminus
(which is required for peptide sequencing). At the C-terminus, a methionine permits the
release of peptides from the resin by CNBr treatment prior to sequencing, while arginine
serves to increase peptide solubility and sensitivity during MALDI-MS sequencing by
providing a fixed positive charge. The two β-alanines add flexibility to the peptides,
making them more accessible to a protein target. The dipeptide LN is added to shift the
masses of the peptides to >600 Da, so that their mass spectral peaks do not overlap with
matrix signals (vide infra). Methionine is excluded from the randomized positions to
avoid internal cleavage during CNBr treatment, and is replaced by its isosteric residue
norleucine. The library was synthesized on TentaGel S NH2 resin (~90 µm in diameter and ~2.86 × 10^6 beads/g) using the split-pool method [4, 6] with each bead carrying ~100
pmol of a unique sequence. This method ensures equal representation of all possible
sequences in the library.
The theoretical diversity of the above library is 19^5 or 2.5 × 10^6 and therefore, in principle, 1 g of resin-bound library covers the entire sequence space. A typical screening involved ~100 mg of resin, to which was added a small amount of an SH2 domain protein (10–50 nM final concentration), constructed as an MBP fusion protein and biotinylated on a surface lysine residue(s). Binding of the biotinylated SH2 domain to a resin-bound pY peptide recruits a streptavidin-alkaline phosphatase conjugate to the surface of that bead. Upon the addition of BCIP, the bound alkaline phosphatase cleaves
BCIP into an indole, which dimerizes to form a turquoise precipitate deposited on the bead surface. As a result of this reaction cascade, beads carrying high-affinity SH2
ligands become colored. The number of colored beads depends on the binding affinity
and specificity of the protein domain as well as the stringency of the screening conditions
(e.g., SH2 domain concentration, number of washings, and length of staining time). The screening reactions were controlled so that 10–100 colored beads were obtained from 100 mg of resin (~280,000 beads). The number of positive beads was quite reproducible for
all of the SH2 domains we have studied as well as between different screenings against
the same SH2 domain. Positive beads were manually removed from the library using a
micropipette with the aid of a dissecting microscope.
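The library-size arithmetic in this section can be reproduced in a few lines. The sketch below uses only numbers given in the text (19 residues at each of 5 randomized positions and ~2.86 × 10^6 beads/g). The coverage estimate is an added illustration, not a result from this work: it assumes each bead independently carries a uniformly random sequence, in which case the expected fraction of sequences present at least once among N beads is approximately 1 - exp(-N/D) for a library of diversity D. All names are hypothetical.

```python
import math

RESIDUES_PER_POSITION = 19  # 18 natural amino acids plus norleucine
RANDOM_POSITIONS = 5        # the five X positions in TAXXpYXXXLNββRM
BEADS_PER_GRAM = 2.86e6     # TentaGel S NH2, ~90 um beads (from the text)


def diversity():
    """Number of distinct sequences in the library."""
    return RESIDUES_PER_POSITION ** RANDOM_POSITIONS


def beads(grams):
    """Approximate number of beads in a given mass of resin."""
    return BEADS_PER_GRAM * grams


def expected_coverage(grams):
    """Expected fraction of sequences represented at least once.

    ASSUMPTION: beads carry independent, uniformly random sequences,
    so the count of beads per sequence is approximately Poisson.
    """
    return 1.0 - math.exp(-beads(grams) / diversity())


if __name__ == "__main__":
    print("diversity:", diversity())
    for grams in (0.1, 1.0):
        print(grams, "g:", round(beads(grams)), "beads,",
              f"{expected_coverage(grams):.1%} of sequences expected")
```

Under this simplifying assumption, 1 g of resin holds slightly more beads than there are sequences but would be expected to display roughly two-thirds of them, and a 100 mg screen samples about a tenth of the sequence space.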
3.3.2 Peptide Sequencing by PED
The development of a generally applicable, highly successful sequencing
technique dubbed “partial Edman degradation” was described in Chapter 2. After a brief
optimization of the conditions for this specific library, that technique was applied with
little modification to the sequencing of the numerous beads obtained from screening the
minimally-encoded pY library above with a high rate of success (>90%). Figure 3.1
shows an example mass spectrum, derived from a single bead carrying the peptide
sequence TA(Nle)YpYATILNBBRM. Note that the isobaric residues Nle, Leu, and Ile
are unambiguously resolved in the spectra by their appearance as a singlet (Ile) vs doublet
peaks (Nle and Leu).
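The core of reading a PED spectrum is that successive ladder peaks differ by the mass of one residue. The following is a simplified sketch of that step only, under stated assumptions: it uses standard monoisotopic residue masses, a hypothetical mass tolerance of 0.2 Da, and a synthetic peak list. It does not model the Ac-Gly/Ac-Ala doublet encoding described above, so Leu, Ile, and Nle are reported as one isobaric group, and Gln and Lys may both match within the tolerance. The function names are hypothetical.

```python
# Monoisotopic residue masses (Da). Leu, Ile, and Nle are isobaric and are
# grouped here; the text resolves them by the singlet/doublet peak pattern.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L/I/Nle": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931, "pY": 243.02966,
}


def make_ladder(residues, core_mass):
    """Build a synthetic ladder: residues are N->C, core_mass is the lightest peak."""
    peaks = [core_mass]
    for name in reversed(residues):
        peaks.append(peaks[-1] + RESIDUE_MASS[name])
    return peaks


def read_ladder(peaks, tol=0.2):
    """Return candidate residues, N-terminus first, from adjacent peak gaps."""
    ordered = sorted(peaks, reverse=True)
    calls = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        matches = [n for n, m in RESIDUE_MASS.items() if abs(delta - m) <= tol]
        calls.append(matches or ["?"])
    return calls


if __name__ == "__main__":
    # Hypothetical ladder for the N-terminal portion T-A-(Nle)-Y-pY.
    ladder = make_ladder(["T", "A", "L/I/Nle", "Y", "pY"], core_mass=1000.0)
    print(read_ladder(ladder))
```

A gap that matches no residue is reported as "?", which corresponds to the missing-peak situation that prevents complete sequence assignment for a bead.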
3.3.3 Specificity of the C-SH2 Domain of SHP-2
Screening of the above library (100 mg) against 10 nM C-terminal SH2 domain of
SHP-2, MBP-CSH2, resulted in 14 intensely colored beads, 12 lightly colored beads, and
53 beads of intermediate color intensity. Since all of the beads were treated in the same manner, the color intensity of a bead should correlate with the binding affinity of the pY peptide on the bead for the SH2 domain used in the screening. The three groups of beads
(79 total) were placed in 3 separate vessels and subjected to partial Edman degradation followed by MALDI-TOF analysis. Out of the 79 samples, 77 produced high-quality spectra, allowing for unambiguous determination of their peptide sequences (Table 3.1).
The mass spectra for the remaining two beads had one or more peaks missing, preventing complete sequence assignment. Construction of a histogram of the selected sequences reveals general trends, such as the strong preference for a nonpolar aliphatic residue at the +3 position, with isoleucine being the most preferred amino acid (present in 57 selected peptides), followed by valine (present in 15 peptides) and leucine (present in 5 sequences) (Figure 3.2). The +1 position has the second most stringent requirement, strongly preferring an alanine (present in 46 peptides) or other small amino acids such as serine (present in 18 sequences), threonine (present in 10 sequences), and valine (present in 3 sequences). The –2 position is also critical for binding to the C-SH2 domain of SHP-2,
preferring a β-branched amino acid such as threonine, valine, or isoleucine, which are occasionally replaced by a tyrosine. There is a weak preference for a β-branched residue at the +2 position and virtually no selectivity at the –1 position.
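The positional histogram described above amounts to counting residues column by column. The Python sketch below does this for the 77 sequences of Table 3.1, transcribed as they appear in the table (M denotes norleucine). The function and variable names are illustrative and are not taken from the original analysis.

```python
from collections import Counter

# The 77 sequences of Table 3.1 (10 nM screening); M denotes norleucine.
TABLE_3_1 = """
VIpYANI IHpYAVI ILpYSTI TIpYTII TVpYSIV VIpYANI PIpYAVI TTpYSTI TYpYTMI TIpYSMV
IEpYAQI NMpYAVI VHpYSTI TIpYTEI TVpYSEV YTpYAQI VLpYAII TYpYSSI TYpYVEI TVpYTEV
AIpYASI VYpYAII IVpYSQI IQpYVQI TVpYASL YSpYASI VYpYAII VTpYSQI TKpYVVI VYpYATL
IWpYASI TQpYAII YTpYSQI TLpYAVV YLpYATL THpYASI SNpYAII VEpYSEI TRpYAVV IQpYAVL
TIpYATI MYpYAII TYpYSMI TYpYAVV TApYAIL TSpYATI YQpYAII TFpYSRI VTpYAIV VTpYATI
INpYAMI YYpYSRI IHpYATV VGpYATI THpYAMI TRpYTQI TVpYASV LYpYATI TMpYAMI VIpYTQI
TRpYAKV NApYATI TTpYAAI VTpYTSI IIpYSQV VApYAVI YKpYARI VFpYTTI VIpYSSV VHpYAVI
YMpYAHI HFpYTTI VIpYSVV IApYAVI YMpYAEI TIpYTVI TQpYSIV
""".split()

POSITIONS = ("-2", "-1", "+1", "+2", "+3")

def positional_counts(sequences):
    """Count how often each residue occurs at each position relative to pY."""
    counts = {pos: Counter() for pos in POSITIONS}
    for seq in sequences:
        n_side, c_side = seq.split("pY")          # two residues before pY, three after
        for pos, residue in zip(POSITIONS, n_side + c_side):
            counts[pos][residue] += 1
    return counts

counts = positional_counts(TABLE_3_1)
for pos in POSITIONS:
    print(pos, counts[pos].most_common(4))
```

Run on these sequences, the counts at +3 (I 57, V 15, L 5) and at +1 (A 46, S 18, T 10, V 3) reproduce the numbers quoted in the text.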
To test whether the library screening result is reproducible, the above experiment was repeated with 50 nM MBP-CSH2 protein under otherwise identical conditions.
Ninety intensely colored and ~150 less colored beads were obtained and sequenced (see
Appendix Table A3.1 for sequences). The plot of the positional frequency of appearance for each amino acid (based on the 90 intensely colored beads) produced a pattern indistinguishable from that derived from the 10 nM screening (Figure 3.2). Inspection of the histograms in conjunction with the individual sequences allows us to draw the following conclusions. First, the SHP-2 C-SH2 domain recognizes a single consensus sequence, (T/V/I/y)XpY(A/s/t/v)X(I/v/l), where lower case letters represent less frequently selected residues and X is any amino acid except for glycine and proline.
Second, the screening method is highly reproducible and robust. Finally, one can unambiguously determine the sequence specificity of an SH2 domain by screening just a fraction of the complete library (~10% in this case), because not all of the randomized positions are crucial for SH2 binding. The same conclusion (the validity of using incomplete libraries) was also borne out by our earlier work with FHA domains [79].
This greatly reduces the cost and time required for the characterization of each SH2 domain.
3.3.4 Specificity of the N-SH2 Domain of SHP-2
Initial screening of 100 mg of the library against 10 nM SHP-2 N-SH2 domain gave rather surprising results; the N-SH2 domain appeared to bind pY peptides of several distinct classes. To obtain additional sequences for more reliable statistical analysis, the screening experiment was repeated twice, once at 10 nM and once at 50 nM N-SH2 protein. Again, the results were highly reproducible, with all three screenings producing the same types of sequences. Altogether, 150 intensely colored beads were selected from 300 mg of the library and their sequences are listed in Table 3.2 (the most colored beads from 10 nM screenings are shown in boldface). Additional sequences from less colored beads are listed in Appendix Table A3.2. Clearly, the selected sequences fall into five distinct classes. The most abundant class (class I) has a consensus sequence of
(I/L/V)XpY(T/V/A)X(I/V/L), which is similar to that of the C-SH2 domain, albeit with some subtle differences at the –2 and +1 positions (Figure 3.3A). First, although the C-SH2 domain most prefers threonine at the –2 position, threonine is rarely (twice) found in the N-SH2-binding peptides. Also, while leucine is seldom selected at the –2 position by the
C-SH2 domain, it is the second most preferred residue at this position for the N-SH2 domain. Another difference is at the +1 position; while the C-SH2 domain strongly prefers alanine to serine, threonine, or valine, the N-SH2 domain selects for threonine, valine, and alanine with approximately equal frequency (but not serine).
The second most abundant class of peptides (Class II) has the consensus of
W(M/v/t/s)pYX(I/l/t/y)X, where the –2 residue is always a tryptophan and the –1 position is usually norleucine, valine, threonine, or serine (Fig. 3.3B). Remarkably, while the +2 position is highly variable among class I peptides, it is the most invariant position on the
C-terminal side of pY for class II peptides. The identity of the preferred residues (Ile,
Leu, Thr, and Tyr) suggests that the +2 side chain is engaged in hydrophobic interactions with the SH2 domain. Consistent with this binding mode, the selected +1 and +3 residues contain predominantly hydrophilic (e.g., Arg, Gln, and Thr) or small side chains, suggesting that they face the solvent. The remaining three classes were less frequently selected and have the consensus sequences of
(I/V/L/T)XpY(L/M/T)Y(A/S/P/M) (class III), (I/V/L/T)XpY(F/M)XP (class IV), and
(L/I/V/T)XpY(M/V)(I/L/V/S)F (class V) (Table 3.2, Appendix Fig. A3.3). To the best of our knowledge, this is the first SH2 domain known to recognize multiple distinct consensus sequences. It is worth noting that all of the pY proteins currently known to bind to SHP-2 N-SH2 domain employ class I motifs (vide infra). It remains to be seen whether nature also uses the class II to V sequences as alternative SHP-2-binding motifs.
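The five consensus sequences can be written as simple patterns and used to sort individual peptides. The Python sketch below is one illustrative way to do so and is not the procedure used to build Table 3.2. It assumes that X is any single character, that the less frequently selected (lower case) residues are allowed alongside the preferred ones, and that M denotes norleucine as in the tables. The example peptides are taken from Tables 3.2 and 3.4.

```python
import re

# Consensus sequences from the text, rewritten as regular expressions.
# "." stands for X; lower-case (less preferred) residues are included as allowed.
CLASS_PATTERNS = {
    "I":   re.compile(r"[ILV].pY[TVA].[IVL]"),
    "II":  re.compile(r"W[MVTS]pY.[ILTY]."),
    "III": re.compile(r"[IVLT].pY[LMT]Y[ASPM]"),
    "IV":  re.compile(r"[IVLT].pY[FM].P"),
    "V":   re.compile(r"[LIVT].pY[MV][ILVS]F"),
}

def classify(peptide):
    """Return every class whose consensus matches the whole peptide."""
    return [name for name, pattern in CLASS_PATTERNS.items()
            if pattern.fullmatch(peptide)]

for peptide in ["LVpYATI", "WMpYRII", "IHpYLYA", "ILpYMIP", "LNpYVIF"]:
    print(peptide, classify(peptide))
```

A peptide may in principle match more than one pattern or none at all, so the function returns a list rather than a single label.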
3.3.5 Specificity of SHIP SH2 Domain
The SH2 domain from SHIP was screened against the pY library at two different
SH2 protein concentrations (10 and 50 nM) and the peptide sequences from the 158 most intensely colored beads were used in statistical analysis (Table 3.3). SHIP SH2 domain binds to pY peptides of the consensus pY(Y/S/T/v)(L/y/nle/f/v)(L/Nle/i/v) (Figure 3.4).
Its specificity overlaps with those of SHP SH2 domains but also has a number of unique features. First, on the N-terminal side of pY, SHIP SH2 domain does not require specific residues for high-affinity binding, although among the selected sequences there appears to be a higher-than-expected number of small residues (e.g., Gly, Pro, and Ala) at the –2 and –1 positions. Second, high-affinity binding to SHIP SH2 domain requires a hydrophobic residue at the +2 position, with leucine being the most preferred, followed by tyrosine, norleucine, phenylalanine, and valine. The latter feature had previously been noted by other investigators [80]. Third, while alanine is among the most preferred amino acids at the +1 position for SHP SH2 domains, none of the SHIP SH2-binding sequences including those derived from lightly colored beads (see Appendix Table A3.3) had an alanine at the +1 position.
3.3.6 Affinity Measurements of Selected Sequences
Representative peptides from each consensus group were re-synthesized individually and tested for binding to the five SH2 domains of SHP-1, SHP-2, and SHIP using the surface plasmon resonance technique (BIAcore) (Figure 3.5). All of the pY peptides tested bound to their cognate SH2 domains with high affinity (KD = 0.2–9.7 µM)
(Table 3.4). As expected, a peptide generally binds with the highest affinity to the SH2 domain used in its selection. For example, peptide IHpYLYA was selected against the N-terminal SH2 domain of SHP-2 (class III). It binds to the N-SH2 domain with a KD value of 0.28 µM but interacts with the other four SH2 domains of SHP-1, SHP-2, and
SHIP with 23–90-fold lower affinity. Likewise, peptide PFpYSLL binds to the SHIP
SH2 domain (which selected it in the screening) with high affinity (KD = 0.20
µM) but much less tightly to the other four SH2 domains (KD = 5.6–14 µM). Some
peptides (e.g., LVpYATI), however, can associate with all five SH2 domains with similar
KD values (2–5 µM), consistent with the previous observation that the five SH2 domains
have overlapping sequence specificities. Thus, all of the pY peptides tested by BIAcore
bound to their cognate SH2 domains with high affinity. To test whether the color
intensity of a bead during screening correlates with the binding affinity of the peptide it
carries, we synthesized and tested peptides TIpYATI and NApYATI, which were both
selected by SHP-2 C-SH2 domain but the corresponding beads were intensely and lightly
colored, respectively. The former has a 6.5-fold higher affinity to the C-SH2 domain.
Note that the latter peptide contains an Asn at the –2 position, which is not among the
preferred amino acids at this position.
3.3.7 Database Search of Potential SHP-1/SHP-2-Binding Proteins
We performed a web-based search of human proteins that contain the tandem
consensus sequence motifs, (VIL)XY(ASTVI)X(ILV)X1–50(TVIY)XY(ASTV)X(IVL),
where X is any amino acid (web site: http://pir.georgetown.edu/). The two motifs were
designed to encompass the N- (class I) and C-SH2 domain consensus sequences of both
SHPs, and were separated by anywhere from 1 to 50 residues. This search resulted in
420 “hits”, representing ~100 unique human proteins (many proteins appeared multiple
times under different names or as fragments). After discarding those proteins that are obvious “false positives” (e.g., secreted proteins or transmembrane proteins with the
consensus motifs in the extracellular environment), we obtained 76 proteins as potential
SHP-1/SHP-2 targets (Table 3.5). Out of the 76 candidate proteins, 25 have previously
been shown to bind to SHP-1 and/or SHP-2 via their SH2 domains [81-111]. Since SHP-
1 and SHP-2 can also bind pY proteins through a single SH2 domain, similar searches
were conducted with single consensus motifs. Several additional proteins that are known to bind SHP-1 and/or SHP-2 were identified, such as PD-1 [112], death receptor [113], leptin receptor [114], insulin receptor substrate-1 [115], and Siglec-10 [116], among others. It remains to be determined whether the other predicted proteins in Table 3.5 are bona fide SHP-1 and SHP-2 binding partners in cellular systems.
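The tandem motif used in the search maps directly onto a regular expression. The Python sketch below shows one way to express it. It is an illustration only, not the query syntax of the web site used in this work. The test sequences are hypothetical: they join two motifs listed in Table 3.5 with an invented glycine spacer to show the effect of the 1 to 50 residue separation.

```python
import re

# (VIL)XY(ASTVI)X(ILV) X1-50 (TVIY)XY(ASTV)X(IVL); "." stands for X (any residue).
TANDEM_MOTIF = re.compile(r"[VIL].Y[ASTVI].[ILV].{1,50}[TVIY].Y[ASTV].[IVL]")

def has_tandem_motif(protein_sequence):
    """True if the sequence contains both motifs separated by 1 to 50 residues."""
    return TANDEM_MOTIF.search(protein_sequence) is not None

# Hypothetical test sequences built from two motifs and an invented spacer.
spaced = "VTYSTL" + "G" * 20 + "IIYSEV"     # 20-residue spacer
adjacent = "VTYSTL" + "IIYSEV"              # no spacer
too_far = "VTYSTL" + "G" * 60 + "IIYSEV"    # 60-residue spacer

for name, seq in [("spaced", spaced), ("adjacent", adjacent), ("too_far", too_far)]:
    print(name, has_tandem_motif(seq))
```

Only the first sequence satisfies the pattern; the other two fail because the spacer length falls outside the allowed range. A search with a single consensus motif corresponds to using either half of the expression on its own.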
3.4 Discussion
The combinatorial library method reported in this work has for the first time provided a complete solution to the problem of identifying linear peptide motifs that interact with a given protein or non-protein receptor. Compared to the previously reported methods, our method has several significant advantages. First, our method identifies individual binding sequences; this feature is crucial for understanding the specificity of receptors that recognize multiple consensus sequences. For example, when the five classes of binding sequences of SHP-2 N-SH2 domain were combined and plotted in the same manner as in Figure 3.2 to give a composite histogram, no clear consensus emerged (see Appendix Figure A3.2). Even for a receptor that has a single consensus sequence, individual sequences are useful in revealing subtle covariance of sequences. For example, among the pY peptides that bind to SHP-2 C-SH2 domain, when the +3 residue is isoleucine, alanine is most frequently found at the +1 position;
however, when valine is the +3 residue, a serine is most preferred at the +1 position
(Table 3.1). Second, our method allows for “fair” competition among all library
peptides, as each bead contains roughly the same amount of peptide molecules (~100
pmol). This is not the case with pY peptide libraries displayed on phage surface, because
such libraries are biased against sequences that are poor substrates of the tyrosine kinases
used to phosphorylate the phage [63, 64]. Youngquist et al. reported another method in
which the peptide sequence on each bead is encoded by generating a set of chain-
termination products during library synthesis and the sequence of the full-length peptide
is determined by mass spectrometric analysis of the set of chain-termination products
[51]. Unfortunately, due to different reactivities of the 20 amino acids, the amount of
chain termination varies with peptide sequence. As a result, the amount of full-length
peptide on each bead also varies, biasing the screening against peptides containing slow-
coupling amino acids (e.g., Ile and Thr). Third, because our method employs chemically
synthesized libraries, modified (e.g., pY) and/or unnatural amino acids (e.g., D-amino
acids) can be easily incorporated into the libraries. Fourth, our method is high-throughput and cost-effective. By employing partial Edman degradation, we can routinely sequence 20–30 beads in an hour, at a cost of ~US$0.50 per bead. Finally, as demonstrated with all three SH2 domains from SHP-2 and SHIP, our method is highly reproducible. Our method is readily applicable to other protein or non-protein receptors.
We have subsequently applied this method to determine the sequence specificity of BIR domains (Chapter 4), WW domains, and Chromodomains (unpublished results).
Our lab had previously determined the sequence specificities for the two SH2 domains of SHP-1 using the method of Youngquist et al. [78]. Comparison of the specificities of the five SH2 domains of SHP-1, SHP-2, and SHIP revealed that these
SH2 domains have overlapping and yet distinctive specificities. Although all five
domains are capable of binding to peptides containing the ITIM,
(V/I/L/T)XpYXX(I/L/V), there are clear differences between SHP and SHIP SH2
domains. For example, SHP SH2 domains require a hydrophobic residue at the –2 position, whereas SHIP SH2 domain can tolerate most of the amino acids at the N-terminal side of pY. On the C-terminal side, SHIP SH2 strongly prefers a leucine at the +2 position, but SHP SH2 domains have no such requirement (except for class II peptides of
SHP-2 N-SH2 domain). There are also more subtle yet significant differences at the
+1 position; while SHP SH2 domains all prefer an alanine at this position, alanine is never found at this position among the SHIP SH2-binding sequences. Among the four SHP SH2 domains, the two N-terminal SH2 domains have similar specificities and the two C-SH2 domains are analogous to each other. Most of the SHP-2 N-SH2-binding peptides tested also bind to SHP-1 N-SH2 domain with similar affinities (Table 3.4). The only exception is the class III peptide, IHpYLYA, which has high affinity and selectivity for SHP-2 N-SH2 domain. The two C-SH2 domains have two major differences. The most preferred residues at the –2 position are valine, isoleucine, and leucine for SHP-1 C-SH2
domain, whereas for SHP-2 C-SH2 domain, they are threonine and valine
(Appendix Figure A3.4). At the +3 position, the most preferred residue is leucine followed by valine and isoleucine for SHP-1 but is isoleucine followed by valine and leucine for SHP-2.
The SH2 specificity data are very useful in understanding the different functions of SHP-1, SHP-2, and SHIP in cell signaling. For example, immunoreceptor PD-1 has previously been reported to bind SHP-2 but not SHP-1 [112]. The pY motif responsible for SHP-2 binding has the sequence TEpYATIVF, which is a perfect match to the
consensus sequence of SHP-2 C-SH2 domain, but not to that of SHP-1 [78]. Many
receptors, however, contain multiple ITIM motifs that match the specificities of both
proteins and are able to bind both SHP-1 and SHP-2. For example, the first ITIM motif
of human Siglec-11 (LHpYASL) closely matches the consensus sequence of SHP-1 SH2 domains, whereas its second ITIM, TEpYSEI, resembles the consensus of SHP-2 C-SH2 domain [82]. Some receptors contain ITIM motifs whose sequences represent a compromise between the consensus of SHP-1 and SHP-2 SH2 domains. Biliary glycoprotein 1 (CD66), which is known to bind both SHP-1 and SHP-2, is such an example [83]. Its two ITIMs (VTpYSTL and IIpYSEV) match the overlapping specificities of SHP-1 and SHP-2 SH2 domains. SHIP has been reported as the main inhibitory molecule for immunoglobulin G Fc receptor signaling pathway by binding to the pYSLL motif on the Fc receptor [80, 117]. Our data show that pYSLL matches the consensus sequence of SHIP SH2 domain and binds the latter with much greater affinity than the SH2 domains of SHP-1 or SHP-2 (Table 3.4). The specificity data can also be used to
predict the interaction partners of the SH2 domain-containing proteins. As described
above, a simple database search identified almost all of the known SHP-1 and SHP-2
interacting proteins (Table 3.5). It is highly probable that some of the other predicted
proteins in Table 3.5 will prove to be bona fide SHP-1 and SHP-2 binding proteins.
In summary, a powerful combinatorial library method has now been developed
for the systematic determination of sequence specificities of protein interaction domains
such as SH2 domains. The specificity information generated by this method will be very
useful in understanding the cellular function of proteins that contain these interaction
domains and the design of specific inhibitors against such protein domains.
VIpYANI  IHpYAVI  ILpYSTI  TIpYTII  TVpYSIV
VIpYANI  PIpYAVI  TTpYSTI  TYpYTMI  TIpYSMV
IEpYAQI  NMpYAVI  VHpYSTI  TIpYTEI  TVpYSEV
YTpYAQI  VLpYAII  TYpYSSI  TYpYVEI  TVpYTEV
AIpYASI  VYpYAII  IVpYSQI  IQpYVQI  TVpYASL
YSpYASI  VYpYAII  VTpYSQI  TKpYVVI  VYpYATL
IWpYASI  TQpYAII  YTpYSQI  TLpYAVV  YLpYATL
THpYASI  SNpYAII  VEpYSEI  TRpYAVV  IQpYAVL
TIpYATI  MYpYAII  TYpYSMI  TYpYAVV  TApYAIL
TSpYATI  YQpYAII  TFpYSRI  VTpYAIV
VTpYATI  INpYAMI  YYpYSRI  IHpYATV
VGpYATI  THpYAMI  TRpYTQI  TVpYASV
LYpYATI  TMpYAMI  VIpYTQI  TRpYAKV
NApYATI  TTpYAAI  VTpYTSI  IIpYSQV
VApYAVI  YKpYARI  VFpYTTI  VIpYSSV
VHpYAVI  YMpYAHI  HFpYTTI  VIpYSVV
IApYAVI  YMpYAEI  TIpYTVI  TQpYSIV
Table 3.1. Selected SHP-2 C-SH2 domain-binding sequences (77 total). All sequences were obtained from a screening experiment performed with 10 nM SHP-2 C-SH2 domain. Boldface, peptides derived from the most intensely colored beads; plain text, peptides from beads of medium color intensity; italics, sequences from the lightly colored beads; M, norleucine.
Class I IRpYVEL VTpYTLI LNpYIVI WTpYSLQ ITpYTYI IApYVEL IVpYTLL LHpYAII WTpYYLF IRpYTYV INpYVQL IMpYTII VVpYAII WTpYVLY Class IV INpYVEI MNpYVTL VTpYALI WTpYYLI ILpYMIP INpYVQI MNpYVIV WTpYQIL Class II TEpYMVP IWpYVSI LRpYIQV WTpYQIT WMpYRII IQpYMVL IQpYVML LRpYMQL WTpYVIT WMpYKIY VLpYMQP IQpYVML LRpYVRV WTpYVTS WMpYNIG VMpYMQP IHpYVMI LRpYVSV WTpYSYT WMpYYIQ LVpYMGP IIpYVVI LHpYVSV WTpYQYV WMpYRLY LHpYMGP ISpYIEI MHpYVQV ITpYRLV WMpYRLI ALpYMIP INpYIEV LYpYLQI ITpYLIG WMpYQLS PMpYMIA ISpYIEV LYpYANI WSpYKIY WMpYYLT ILpYFIP ILpYTEV LYpYAQV WSpYVLV WMpYYLY VIpYFVP LVpYTEV LFpYAEI WMpYTLN Class III IVpYFVP ITpYTEV VMpYAEI WMpYMMP IHpYLYA IIpYFYP IFpYTAV LRpYAKL WMpYRMN TLpYLYA VQpYFIR IFpYTAI LYpYATI WMpYRYQ YTpYLVA IYpYTPV LVpYATI Class V WMpYYQV IFpYLYS IMpYTDI LTpYVTI QMpYYLY WIpYFIR VMpYLYS IYpYTDI LRpYVSI LYpYYQY WIpYTIG VVpYMYS IVpYADI LNpYMTI LRpYLVY IIpYTIG IVpYLYT IRpYAQI MYpYATI ITpYLVY WIpYYTR LNpYLYM IVpYAML MYpYATI LNpYMTF WVpYTIN VLpYLYP VApYVEL YApYATI MSpYMVF WVpYRID IKpYTYP VTpYVQL SYpYASI YNpYMVF WVpYYIG IMpYTYP VYpYTEI MYpYARI LNpYVIF WVpYYIR ITpYTYP VYpYTQI YIpYTTV LNpYVLF WVpYYTY VVpYMYT VIpYAQL YVpYTAI LYpYTSF WVpYRLE VTpYMYT VNpYTTL LNpYAVI LYpYATF WTpYSLA YVpYTYT VTpYTII LRpYAVI RApYIVM WTpYSLY ISpYTYI
Table 3.2. Selected SHP-2 N-SH2 domain-binding sequences (150 total). Boldfaced sequences derive from the most intensely colored beads during 10 nM N-SH2 domain screening. M, norleucine.
PFpYSLL  TMpYSFL  GGpYTMM  YDpYVLM  VGpYYFI
PRpYSLV  TLpYSLL  AHpYTLM  GTpYYLM  IKpYYYL
PApYSMI  TYpYSVL  ALpYTMM  GIpYYYL  LGpYYLL
PPpYSMM  AMpYSFM  SApYTLM  GIpYYYM  LQpYYML
PYpYSFI  AYpYSYI  STpYTYM  GGpYYVI  LLpYYYV
PFpYSFI  AVpYSIL  SSpYTLL  GGpYYFM  MGpYYLL
PPpYSFM  NYpYSYL  SPpYTLM  GYpYYLM  MKpYYMM
PRpYSYM  QGpYSMM  NGpYTLL  AYpYYLL  MQpYYLM
PKpYSYL  RGpYSML  NMpYTLL  AYpYYLV  MVpYYYV
PKpYSYV  VYpYTLL  HPpYTLM  ATpYYYM  FNpYYLL
PApYSYI  VYpYTLM  HYpYTLM  AVpYYLM  FYpYYLL
PRpYSVM  TGpYTLL  QYpYTMM  AHpYYLM  FKpYYLM
PFpYSVI  TIpYTMM  RWpYTLM  AGpYYMI  FApYYML
PLpYSTL  TYpYTFI  WLpYTLM  AGpYYFV  FNpYYMI
PMpYSTM  IQpYTLL  VGpYVLL  PIpYYLL  FMpYYYL
PRpYSTM  PYpYTLI  LGpYVMM  PGpYYLI  YGpYYML
PLpYSIL  PFpYTLI  SLpYVLL  SApYYMM  WYpYYLL
FVpYSLM  PMpYTLM  AGpYVFM  SApYYYV  WVpYYLV
FApYSYL  PRpYTLM  DGpYVLM  SYpYYYV  HPpYYLL
YFpYSLM  LPpYTLM  EGpYVYM  SFpYYYI  HPpYYLM
YQpYSIM  YVpYTLM  GApYVFL  TGpYYLL  NApYYML
YGpYSMM  YMpYTLL  HKpYVLL  TGpYYLL  QIpYYLL
YApYSYV  YGpYTLM  HPpYVLM  TGpYYLI  EGpYYFM
YLpYSYV  YYpYTYI  MGpYVML  TGpYYYM  PFpYFLL
HSpYSLL  FTpYTLM  PLpYVLM  TTpYYYL  TVpYFLL
LYpYSLL  FFpYTLL  QGpYVLL  TSpYYYI  QTpYFLM
VLpYSLL  FQpYTMM  QGpYVML  TRpYYLM  SGpYFLM
VYpYSLL  FTpYTYM  TSpYVLL  VGpYYLL  YGpYFLM
VYpYSLV  MTpYTLM  VSpYVLL  VKpYYLL  AGpYFYV
VYpYSYI  GYpYTLI  VGpYVYM  VGpYYMI  TMpYAFI
VYpYSYL  GPpYTLI  WGpYVML  VTpYYMM  KGpYQLL
         GGpYTMI  WApYVMM  VYpYYYL
Table 3.3. Selected SHIP SH2 domain-binding sequences (158 total). Boldfaced sequences were from the most intensely colored beads from 10 nM SH2 domain screening. M, norleucine.
[Figure 3.1: mass spectrum; x-axis, m/z (800–1600). Labeled peaks at m/z 732.5, 845.5, 946.6, 1017.6, 1260.6, 1423.7, 1431.8, 1536.8, 1607.8, and 1708.9; residue labels between peaks: I, T, A, pY, Y, Nle, A, T.]
Figure 3.1. MALDI mass spectrum from a C-SH2 domain-selected bead after nine rounds of partial Edman degradation. The doublet at m/z 1423.7 and 1431.8 indicates that the residue N-terminal to tyrosine is a norleucine. M*, methionine prior to CNBr cleavage and homoserine lactone after CNBr cleavage.
[Figure 3.2: five bar-chart panels, one per position (–2, –1, +1, +2, +3); x-axis, amino acid (D E N Q H K R W F Y M L I V T S A G P); y-axis, Occurrence (0–70); two series, 10 nM and 50 nM.]
Figure 3.2. Specificity of the C-SH2 domain of SHP-2. Amino acids are identified at each position from –2 to +3 relative to pY (position 0). Occurrence represents the number of selected sequences containing each amino acid at that position. Open bar, results from screening at 10 nM C-SH2 protein (77 sequences); closed bar, results from intensely colored beads from screening at 50 nM C-SH2 protein (90 sequences); M, norleucine.
[Figure 3.3: two columns of bar-chart panels, (A) and (B), each with one panel per position (–2, –1, +1, +2, +3); x-axis, amino acid (D E N Q H K R W F Y M L I V T S A G P); y-axis, Occurrence (0–40).]
Figure 3.3. Specificity of the N-SH2 domain of SHP-2. Amino acids are identified at each position from –2 to +3 relative to pY (position 0). Occurrence represents the number of selected sequences containing each amino acid at that position. (A) Class I sequences; (B) class II sequences. M, norleucine.
[Figure 3.4: five bar-chart panels, one per position (–2, –1, +1, +2, +3); x-axis, amino acid (D E N Q H K R W F Y M L I V T S A G P); y-axis, Occurrence (0–80).]
Figure 3.4. Specificity of SHIP SH2 domain. Amino acids are identified at each position from –2 to +3 relative to pY (position 0). Occurrence represents the number of selected sequences containing each amino acid at that position. M, norleucine.
Peptide         SHP-2 NSH2      SHP-2 CSH2      SHP-1 NSH2      SHP-1 CSH2      SHIP SH2
 1) IHpYLYA     0.28 ± 0.04*    12 ± 2.5        7.0 ± 0.43      27 ± 6.9        13 ± 0.95
 2) LVpYTEV     1.4 ± 0.12      8.5 ± 1.2       3.2 ± 0.17      8.8 ± 1.3       3.8 ± 0.26
 3) VApYVEL     3.6 ± 0.11*     3.7 ± 0.10      4.9 ± 0.10      10 ± 0.92       5.2 ± 0.21
 4) LVpYATI     1.9 ± 0.14*     2.0 ± 0.20      1.6 ± 0.51      5.2 ± 0.51      3.5 ± 0.15
 5) ITpYTYP     2.4 ± 0.22*     9.7 ± 1.9       2.1 ± 0.07      11 ± 3.4        3.2 ± 0.24
 6) WMpYRII     3.0 ± 0.39*     20 ± 6.2        8.5 ± 0.98      23 ± 5.0        6.3 ± 0.36
 7) WIpYRII     7.2 ± 0.73*     11 ± 0.55       6.5 ± 1.2       30 ± 8.4        11 ± 2.1
 8) WTpYQIL     9.7 ± 0.32*     10 ± 0.41       17 ± 1.0        59 ± 3.6        3.8 ± 0.32
 9) TIpYATI     3.9 ± 0.26      0.6 ± 0.07*     6.4 ± 0.42      2.4 ± 0.11      2.2 ± 0.19
10) NApYATI     34 ± 2.9        3.9 ± 0.42*     28 ± 2.62       16 ± 2.2        10 ± 0.51
11) PFpYSLL     9.7 ± 0.89      9.2 ± 1.5       5.6 ± 0.47      14 ± 4.5        0.20 ± 0.03*
Table 3.4. Dissociation constants (µM) of selected pY peptides toward SH2 domains.
All peptides are N-terminally acetylated and contain the C-terminal linker, LNBKR-NH2.
The lysine side chain was acylated with a PEG4-biotin moiety for immobilization. The
SH2 domains were constructed as N-terminal six-histidine fusion proteins. M, norleucine.
The asterisk indicates the SH2 domain by which each peptide was selected in screening.
[Figure 3.5: (a) overlaid sensorgrams, Response Units (0–2500) vs. Time (s, 0–160), at [N-SH2] = 5.2, 2.6, 1.3, 0.65, 0.33, 0.16, and 0.08 µM, with the equilibrium response RUeq marked; (b) RUeq (0–2000) vs. [N-SH2] (µM, 0–6).]
Figure 3.5. SPR analysis of the binding of the SHP-2 N-SH2 domain to peptide IHpYLYA. (a) Overlaid BIAcore sensorgrams at indicated concentrations of N-SH2 protein (0.08–5.2 µM). (b) Secondary plot of resonance signal under equilibrium binding conditions against SH2 concentration. The data were fitted to the equation RUeq = RUmax × [SH2]/(KD + [SH2]).
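The equilibrium fit described in the caption can be sketched in a few lines of Python. The example below is an illustration of fitting the stated equation by least squares and is not the software used in this work. The concentrations are those listed for Figure 3.5a. The response values are synthetic, generated from the model itself with an invented RUmax of 1800 and the KD of 0.28 µM reported in Table 3.4; they are not measured data. The function names and the grid-search approach are choices made for this example.

```python
def equilibrium_response(conc, ru_max, kd):
    """RUeq = RUmax * [SH2] / (KD + [SH2])."""
    return ru_max * conc / (kd + conc)

def fit_kd(concs, responses, kd_grid):
    """Least-squares fit: for each trial KD the best RUmax has a closed form."""
    best = None
    for kd in kd_grid:
        f = [c / (kd + c) for c in concs]
        ru_max = sum(y * fi for y, fi in zip(responses, f)) / sum(fi * fi for fi in f)
        sse = sum((y - ru_max * fi) ** 2 for y, fi in zip(responses, f))
        if best is None or sse < best[0]:
            best = (sse, kd, ru_max)
    return best[1], best[2]

# N-SH2 concentrations (µM) listed for Figure 3.5a
concs = [0.08, 0.16, 0.33, 0.65, 1.3, 2.6, 5.2]
# Synthetic, noise-free responses generated from the model (NOT measured data);
# RUmax = 1800 is invented, KD = 0.28 µM is the value reported in Table 3.4.
responses = [equilibrium_response(c, 1800.0, 0.28) for c in concs]

kd_grid = [i / 1000 for i in range(1, 20001)]   # 0.001 to 20 µM in 0.001 µM steps
kd_fit, ru_max_fit = fit_kd(concs, responses, kd_grid)
print(f"KD = {kd_fit:.2f} uM, RUmax = {ru_max_fit:.0f}")
```

Because RUeq is linear in RUmax, only KD needs to be searched, which keeps the sketch free of external fitting libraries. With real sensorgram data the responses list would hold the measured plateau values instead.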
Table 3.5. Human proteins predicted to bind to SHP-1 and/or SHP-2 via SH2 domains.
These 76 candidate proteins remained after excluding repeated, fragmented, secreted, and extracellular proteins from the 420 possibilities returned from the database search. ^a Proteins that have previously been shown to bind to SHP-1 via its SH2 domains. ^b Proteins that have previously been shown to bind to SHP-2 via its SH2 domains.
Protein | Binding Motif(s) | Ref.
Activating NK cell receptor 2B4 ^b | TIYSMI, TLYSLI | [81]
adenylate cyclase, type VI | VSYVVL, IAYTLL |
adipocyte G protein-coupled receptor 175 | LVYSLV, YVYAGI |
alternative splicing factor-1 | VCYADV, TAYIRV |
alternative splicing factor-3 | VCYADV, VGYTRI |
B and T lymphocyte attenuator ^a,b | LLYSLL, IVYASL, TEYASI | [84]
beta-hexosaminidase β-subunit | IEYARL, TTYSFL |
biliary glycoprotein-1 (CD66, CEACAM-1) ^a,b | VTYSTL, IIYSEV | [83]
coagulation factor II (thrombin) receptor | VCYVSI, VHYSFL, YVYSIL |
dol-P-man dependent α(1-3)-mannosyltransferase | VAYTEI, YDYTQL |
Ewing’s sarcoma protein-1 | LVYTSI, YPYSVL |
exportin-7, ran-binding protein-16 | IGYSSV, TFYTAL, SYYSLL |
G protein-coupled receptor RDC1 | VLYSFI, TEYSAL |
G6b-B protein of MHC III ^a,b | LLYADL, TIYAVV | [85]
H-rev107-like protein (HRLP-5) | VKYSRL, VQYSLI |
human germinal-center associated lymphoma protein | LCYTLI, TEYSLL |
immune receptor expressed on myeloid cells-1 (polymeric immunoglobulin receptor) ^a | LCYADL, VEYVTM, ISYASL, TEYSTI | [86, 87]
immunoglobulin superfamily receptor translocation associated-1 (IFGP-2) | LVYSEI, VVYSEV |
immunoglobulin superfamily receptor translocation associated-2 | VVYSEV, IIYSEV |
immunoglobulin-like transcript 2, leukocyte immunoglobulin-like receptor-1 (MIR-7) ^a | VTYAEV, VTYAQL | [88]
immunoglobulin-like transcript 3 (LIR-5) ^a | VTYAKV, VTYAQL | [89]
immunoglobulin-like transcript 5 (LIR-3) | VTYAPV, VTYAQL |
inhibitory receptor protein 60 (IRC-1) ^a,b | LHYANL, VEYSTV, LHYASV | [90]
interleukin 8 receptor α (CXCR-1) | IAYALV, ILYSRV, IIYAFI |
interleukin 8 receptor β (CXCR-2) | IIYALV, ILYSRV, LIYAFI |
killer cell Ig-like receptor 2DL1 (p58, NKAT-1) ^a,b | VTYTQL, IVYTEL | [91, 92]
killer cell Ig-like receptor 2DL2 (NKAT-6) | VTYTQL, IVYAEL |
killer cell Ig-like receptor 2DL3 (NKAT-2) | VTYAQL, IVYTEL |
killer cell Ig-like receptor 3DL1 (p70, NKB-1, NKAT-3) ^a,b | VTYAQL, ILYTEL | [93]
leucine rich neuronal protein (LRCH-4) | VFYVVL, VTYTRL |
leukocyte antigen (CD84) ^a,b | TIYTYI, TVYSEV | [94]
leukocyte-associated immunoglobulin-like receptor-1 ^a | VTYAQL, ITYAAV | [95]
lipid phosphate phosphorylase-1 (phosphatidic acid phosphatase-2a) | LPYVAL, IPYALL |
metabotropic glutamate receptor-2 | LCYILL, VCYSAL |
metabotropic glutamate receptor-3 | LCYILL, ICYSAL |
metabotropic glutamate receptor-4 | LSYVLL, ISYAAL |
metabotropic glutamate receptor-7 | LSYVLL, ISYAAL |
multiple C2-domain and transmembrane region protein-2 | LRYIIL, VQYAEL |
natural killer inhibitory receptor NKG2-A ^a,b | VIYSDL, ITYAEL | [96]
natural killer-, T-, B-cell antigen receptor ^a,b | LEYVSV, TVYASV, TIYSTI | [97]
neuropeptides B/W receptor type 1 (GPR7) | VVYAVI, VLYVLL |
novel protein similar to PRAME | LSYVLL, IHYSQL |
olfactory receptor 1F1 | LFYSTI, VLYTVV |
olfactory receptor 8D1 | ILYSIL, VFYTTV |
olfactory receptor 12D2 | LRYTVI, LFYAPV, IMYTVV |
olfactory receptor 12D3 | ISYSSV, LRYTVI, IMYSAV |
olfactory receptor 51B5 | ISYVLI, VFYVTV |
olfactory receptor 51V1 | TVYTVL, LRYSSI |
osteoblast-specific factor-2 | IKYIQI, IKYTRI |
paired immunoglobin-like type 2 receptor alpha (FDF03) ^a,b | IVYASL, TLYSVL | [98, 99]
phosphoribosyl transferase domain containing-1 | LEYVLI, IGYSDI |
PIG-M mannosyltransferase | VRYTDI, YRYTPL |
platelet endothelial cell adhesion molecule-1 (CD31) ^a,b | VQYTEV, TVYSEV | [100]
polycystin-1, polycystic kidney disease-related protein-1 | VTYTPV, VQYVAL, LNYTLL |
protein KIAA0319 (contains polycystic kidney disease 1 domains) | IFYVTV, TKYTIL |
protein zero related ^b | VIYAQL, VVYADI | [101, 102]
R3H domain protein-1 | IPYTSV, VYYSVI |
ran-binding protein-17 | VGYILL, TFYTAL, TSYTML, ICYSAL |
SH2 domain-containing phosphatase anchor protein-1 ^a | VVYSQV, VIYSSV | [103]
sialic acid binding Ig-like lectin-2 (CD22) ^a | VTYSAL, IHYSEL, VDYSEL | [104, 105]
sialic acid binding Ig-like lectin-3 (CD33) ^a,b | LHYASL, TEYSEV | [106]
sialic acid binding Ig-like lectin-5 (OBBP-2) | LHYASL, TEYSEI |
sialic acid binding Ig-like lectin-6 (OBBP-1) | LHYAVL, TEYSEI |
sialic acid binding Ig-like lectin-9 (FOAP-9) ^a | LQYASL, TEYSEI | [107]
sialic acid binding Ig-like lectin-11 ^a,b | LHYASL, TEYSEI | [82]
sialic acid binding Ig-like lectin-12 (S2V) ^a,b | IQYASL, YEYSEI | [108]
signal regulatory protein alpha-1 (SHPS-1, BIT, MyD-1, PTPNS-1) ^a,b | ITYADL, TEYASI, LTYADL | [109, 110]
sodium channel type V alpha subunit (cardiac muscle alpha-subunit) | LNYTIV, IMYAAV, TTYIII, IEYSVL |
sodium channel type XI alpha subunit (peripheral nerve sodium channel 5, hNaN) | INYTII, IIYAAV, VSYIII, IKYSAL |
solute carrier family 19, member 3 (SLC19A3) | LNYVQI, VGYVKV |
somatostatin receptor 1 ^b | VIYVIL, VLYTFL, LCYVLI | [111]
spastic ataxia of Charlevoix-Saguenay | IHYTLL, YTYAII |
trace amine receptor-5 (GPR102) | LTYSGA, ILYSKI |
ubiquitin-specific protease-9, X chromosome (DFFRX) | VMYANL, YQYAEL |
ubiquitin-specific protease-9, Y chromosome (DFFRY) | VMYANL, YQYAEL |
zinc finger protein 521 | VGYTSV, VTYSCI |
CHAPTER 4
DETERMINATION OF THE TETRAPEPTIDE LIGAND SPECIFICITIES OF
THE BIR2 AND BIR3 DOMAINS OF XIAP BY COMBINATORIAL
LIBRARY SCREENING
4.1 Introduction
Apoptosis, the process of genetically programmed cell death, fulfills an essential requirement in normal physiology by eliminating unwanted cells during embryogenesis,
immune cell education, viral infection, and after environmental insult [118, 119]. As
might be expected for any process having such a final outcome, its application must be
timely and appropriate, and thus regulation is paramount. Excessive or inappropriate
apoptotic activity is by definition contrary to viability and associated with several disease
states, whereas insufficient apoptotic activity results in pathological phenotypes such as autoimmune disorders and cancers [120]. Maintaining the proper balance requires that caspases, the cysteine proteases responsible for the initiation and execution of cellular dismemberment, be held in check until the appropriate time. Regulation of these powerful proteases is multi-layered, beginning with their synthesis as relatively inactive zymogens. Activation requires internal proteolytic processing, resulting in small and large subunits which dimerize in an α2β2 stoichiometry to form a catalytically competent protease [121, 122]. Further regulation through the direct binding of inhibitor-of-apoptosis proteins (IAPs) and blocking of dimerization or catalytic sites of processed caspases occurs before the caspase activities are released [121, 123-125]. These IAPs are in turn regulated by small soluble proteins which bind to the baculoviral IAP repeat (BIR) domains within IAPs, thereby sequestering the inhibitors and releasing the caspase activities [126]. Thus, the balance between cellular survival and demise is largely determined by a macromolecular titration of caspases by IAPs, and in turn, IAPs by their antagonists.
The central importance of IAPs in the regulation of the apoptotic process is demonstrated by their ubiquitous distribution in the animal kingdom and strong evolutionary conservation. Following the discovery of the first IAPs capable of blocking apoptosis in lepidopteran cells, the baculoviral proteins Op-IAP and Cp-IAP, homologues were soon identified in Drosophila, mouse, and human. Subsequently, IAP genes have been isolated from insect, avian, piscine, mammalian, and viral sources [118, 127]. As with other proteins associated with the death process, widespread identification has been greatly aided by the high degree of sequence similarity among IAPs [119]. This sequence similarity has been extended by functional complementation experiments.
Illustrating the degree of general conservation in the apoptotic machinery, IAPs from baculoviruses (which infect arthropod hosts) and Drosophila have been demonstrated to inhibit apoptosis in human cell lines, and human IAPs can reciprocate in Drosophila cells
[128-130]. At the root of this interspecies functional compensation is the conserved architecture of the IAPs and in particular the nearly superimposable structures of the BIR domains [118].
All IAPs characterized to date contain one to three BIR domains and many possess
C-terminal RING (really interesting new gene) finger domains, although variation in the C-terminus is not uncommon. BIR domains are relatively small, ~70 amino acids, and
contain a zinc ion coordinated by invariant Cys and His residues. One of the most
characterized members of the IAP family is human X-linked IAP (XIAP). It includes
three N-terminal BIR domains (BIR1, BIR2, and BIR3) and a canonical C-terminal
RING finger, which has been demonstrated to possess ubiquitin ligase activity [131]. The
BIR2 and BIR3 domains of XIAP, which share 40% sequence identity [132], require free
N-termini in their binding partners for recognition. The tetrapeptide sequence NH2-ATPF, derived from the small subunit of processed caspase-9, has been demonstrated to bind the BIR3 domain and is necessary for the inhibition of caspase-9 activity [133].
Moreover, the peptide NH2-AVPIA, derived from the N-terminus of mature Smac, has
been shown to bind to both domains [134, 135]. Peptides similar to the Smac and caspase-9 N-termini are classified as IAP binding motifs (IBMs). Interestingly, the BIR1 domain, despite sharing 41% sequence identity with BIR2 [136], has never been observed in protein-protein interactions. Additionally, IAPs have been suggested to play roles in
processes outside of apoptosis, such as Survivin’s role in cytokinesis [137, 138]. Thus, it
seems likely that BIR domains may be able to recognize molecules other than caspases
and their antagonists.
It was therefore of interest to us to define the entire population of potential
binding candidates by individually screening each of the BIR domains of XIAP against a
synthetic tetrapeptide library. By synthesizing the library on the solid phase we could
ensure an unbiased free N-terminus. Additionally, this N-terminus was ideal for sequencing by the partial Edman degradation technique employed by our lab. Although the BIR1 domain failed to exhibit any affinity towards this library, the specificities and tolerances of the other two domains were quite different from each other. The BIR3 domain generally selected for IBMs, but it was more willing to accept a Val at the N-terminus (P1 position) than BIR2. While neither domain showed strong selectivity at
P2, the selectivity at P3 was much broader for BIR2 than BIR3. Finally, a distinct difference was observed at P4, with BIR2 selecting Val and small hydrophobes in contrast to BIR3’s strong preference for Phe and Ile residues. We present herein the results of library screenings performed under various stringencies for the BIR domains of
XIAP.
4.2 Experimental Techniques
4.2.1 Vector Constructs
Construction of the pETMAL, pPPTmal, and pGFPmal vectors was described in
Chapter 3. For mammalian cell culture co-immunoprecipitation experiments probing
XIAP and caspase-10d (casp10) interactions, the mammalian expression vectors pEBG-
SrfI and pCMV-SPORT6 were used, respectively. The pEBG plasmid accepted XIAP without modification as described in section 4.2.2, thus creating a GST-XIAP fusion. A pCMV-casp10 construct was obtained from the labs of Dr. Yusen Liu and was altered to include a C-terminal 3x-hemagglutinin (3xHA) tag. In order to insert the 3xHA tag sequence, QuikChange mutagenesis was performed to introduce a unique Bam HI restriction site 5’ to the casp10 stop codon. The primers for this mutation were 5’-CCC
TGG ATG CAC TTT CAT TAG GAT CCT AGC AGA GAG TTT TTG TTG G-3’ and its complement. The unusual length of these 46-mers resulted from the high A/T content of the 3’ end, and the primers required purification by 12% urea-PAGE before the reaction. QuikChange was performed according to the protocol described in Chapter 3 with the thermocycling regimen 1 x 95 ºC (30”), followed by 18 x [95 ºC (30”), 52 ºC (1’), 68 ºC (15’)], and an additional 15’ extension at 68 ºC. Digestion of the parent plasmid by Dpn I and
transformation into XL1 Blue yielded only 6 colonies, of which 2 appeared correct by
restriction mapping. The 3xHA tag was sub-cloned from a pSRα-3HA-Jnk1 construct
(also from Dr. Liu) [139] by PCR using the primers 5’-TTT GCA GAA GCT CAG AAT
AAA CGC-3’ and 5’-GGC ACT CGA GCT AGC TAG TCA CGC TTG CTC GCC AT-
3’. Despite not encoding a restriction site in the former primer, the PCR product contained a Bam HI site encoded by the pSRα vector 10 base-pairs downstream of primer hybridization. Thus, the PCR fragment was ligated at unique Bam HI and Xho I sites shared with the mutagenized pCMV plasmid. Overlapping dideoxy sequencing reads confirmed the authenticity and fidelity of the C-terminally fused, 3xHA-tagged caspase-10d construct.
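The stated primer length and the engineered Bam HI site can be checked directly from the mutagenesis primer quoted above. The following is a minimal sketch rather than part of the original protocol; the function and constant names are illustrative, and the 15-base window used to inspect the 3’ end is an arbitrary choice.

```python
# Forward QuikChange primer as quoted in the text (spaces removed).
PRIMER = "CCC TGG ATG CAC TTT CAT TAG GAT CCT AGC AGA GAG TTT TTG TTG G".replace(" ", "")
BAMHI_SITE = "GGATCC"  # Bam HI recognition sequence


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


if __name__ == "__main__":
    print("primer length:", len(PRIMER))
    print("Bam HI site index (0-based):", PRIMER.find(BAMHI_SITE))
    # Window size of 15 is an assumption made only for illustration.
    print("A/T fraction of last 15 bases:", at_fraction(PRIMER[-15:]))
    print("complementary primer:", reverse_complement(PRIMER))
```

Because the Bam HI site is palindromic, the complementary primer carries the same site at the mirrored position.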
Additional mutants of casp10 were constructed with the QuikChange kit in order to suppress the apoptotic phenotype. The following primers (plus complement) were used to make the indicated mutants: C401S, 5’-CAT CCA GGC CTC CCA AGG TGA AGA G-
3’; DA415,416GS, 5’-CGT ATC CAT CGA AGC AGG ATC CCT GAA CCC TGA
GCA GG-3’. The C401S mutant involves the active site Cys residue, whereas the
DA415,416GS mutation removes the inter-subunit cleavage site at Asp 415 while inserting an Eco RI site.
4.2.2 XIAP BIR Domain and Full-Length Constructs
The DNA sequences coding for the BIR1 (aa 1-123), BIR2 (aa 124-240), and
BIR3 (aa 241-356) domains were isolated by PCR from a pGEX-4T plasmid containing the BIR 1-3 domains of XIAP [140]. The DNA primers used were: BIR1, 5’- GGA ATT
CAT GAC TTT TAA CAG TTT TGA AGG-3’ and 5’-CTT GAA GCT TGT CCT CAG GAT CCC AGA TAG TTT TCA AG-3’; BIR2, 5’-GAG AAT TCA GAG ATC ATT
TTG CCT TAG ACA GG-3’ and 5’- CTG GAA AGC TTA TTC ACT TCG AAT ATT
AAG ATT CC-3’; and BIR3, 5’-CGA ATT CTC TGA TGC TGT GAG TTC TGA TAG-
3’ and 5’-GTA CGA AGC TTA AGT AGT TCT TAC CAG ACA CTC C-3’. PCR products were digested at the underlined sites with the restriction endonucleases Eco RI and Hind III, and ligated into their corresponding sites in pMAL-c2. The sub-cloning was repeated from the pGEX-BIR1-3 template to the pETMAL vector using the same sets of primers. Subsequently, each domain was further sub-cloned into pPPTmal, pGFPmal, and pET-28a vectors by PCR using the malE and T7-terminator sequencing primers and pETMAL templates as described for SH2 domains in Chapter 3.
Full length XIAP including the RING domain was cloned from a Marathon human spleen cDNA library (Clontech) by PCR according to the iQ Supermix protocol with the primer set: 5’-GAG GAT CCA TGA CTT TTA ACA GTT TTG-3’ and 5’-GGT
CTA GAT TAA GAC ATA AAA ATT TTT GC-3’. Specifically, a 50 µL reaction containing 5 µL Marathon library, 0.4 µM each primer, and 25 µL iQ Supermix
(BioRad’s proprietary mixture of dNTPs, buffer, and polymerase) was thermocycled 1 x
94ºC (3’), 33 x 94 ºC (35”), 57 ºC (35”), 72 ºC (1.5’), and allowed an extra extension for
4’ at 72 ºC. The cloned DNA was restriction digested with Xba I followed by treatment with Klenow fragment to achieve a blunt 3’ end. The vector pET-28a was similarly made blunt at the unique Hind III site. Both the vector and the gene were digested with Bam
HI, purified by spin column, and ligated at room temperature (45’) in the presence of Eco
RI before transformation into XL1 Blue E. coli. Dideoxy sequencing confirmed the ligation of human XIAP in-frame in the pET vector. In a nearly identical fashion, the full-length cloned XIAP was ligated into a pGEX-2T vector, except the blunt end was
introduced in the vector at the Eco RI site and ligation took place in the presence of Sma I.
Lastly, a mammalian GST-fusion construct was generated in the pEBG vector.
The full-length XIAP gene was amplified from the pET construct using the T7 promoter and terminator primer set. In this way the gene gained a unique 3’ Not I restriction site from the pET vector, while retaining the 5’ Bam HI site. A pEBG vector containing the rat MPK-1 gene was received from Dr. Liu. For unknown reasons, the native pEBG-Srf I vector without an inserted gene gave poor ligation results when digested by Bam HI and
Not I. The pEBG-MPK-1 construct facilitated agarose gel purification following complete Bam HI/Not I digestion, and following similar digestion of amplified XIAP gene, ligation at the complementary sites proceeded efficiently. Dideoxy sequencing confirmed the pEBG-XIAP construct’s authenticity. All pCMV and pEBG constructs for mammalian cell culture experiments were purified by Midi-Prep according to Qiagen’s protocols prior to transfection.
4.2.3 Purification and Labeling of His6-MBP-BIR Proteins
All His6-MBP-BIR domain fusion proteins were expressed, purified, and biotinylated precisely as described for the analogous SH2 domain constructs. Several aliquots of purified fusion protein were labeled with either 2 eq. or 4 eq. of fluorescein-OSu in a manner exactly like that for biotin-OSu. Furthermore, these fusion proteins were purified without any labeling for SPR binding affinity measurements as described in
Chapter 3.
4.2.4 Purification of GST-BIR1-3, GST-XIAP, and GST Control Proteins
E. coli BL21(DE3) Rosetta CodonPlus cells harboring the proper pGEX plasmid
were grown in LBA medium to the mid-log phase and induced by the addition of 300 µM
isopropyl-β-D-thiogalactoside (IPTG) for 4-6 hr at 31 ºC. The cells were harvested by
centrifugation and lysed in Buffer 22 by passing through a French press. Each GST
protein was purified from the crude lysate by immobilization on ~10-mL of GST-Bind
Resin (Novagen) according to manufacturer’s recommended procedures. After washing
with two column volumes of Buffer 22 minus protease inhibitors and 4 column volumes
of Buffer 23, ~1 mL of resin was removed and added to an equal volume of glycerol for
storage at -20 ºC, while another ~1 mL of resin was removed and stored at 4 ºC. The
remainder of the resin was eluted by 150 mL of Buffer 23 containing 10 mM reduced
glutathione (GSH). The eluted protein was concentrated and quantitated by the Bradford
method for estimation of the resin-bound GST protein’s concentration. After addition of
glycerol to 40%, the protein was flash frozen in a dry ice/isopropanol bath and stored at -
80 ºC.
4.2.5 Synthesis of BIR Libraries
The first library was synthesized on 5 g of 130-µm TentaGel S NH2 resin using
standard Fmoc chemistry. All other aspects of library synthesis were identical to those
described for pY library synthesis, with the exception of pY and constant N-terminal sequence incorporation. The de-protected library NH2-XXXXLNββRM-resin is referred
to as the BIR library. A second, related tetrapeptide library was synthesized differing in
its C-terminal linker. The same general FMOC chemistry and Leu/Lys/Nle capping
strategies were employed, but the library was synthesized on 90 µm beads carrying ~25% fewer copies of each peptide and a quaternary ammonium salt in place of the constant
Arg of the linker. Synthesis was begun by exhaustive derivatization of 1 g of TentaGel S
NH2 resin by Met as usual, but in place of Arg, Fmoc-Lys(BOC)-OH was next
incorporated. FMOC de-blocking was followed by exhaustive chain-termination by (3-
carboxypropyl)trimethylammonium chloride. Removal of the BOC side-chain protection of Lys was effected by treatment with TFA containing 1% H2O and 4%
triisopropylsilane (v/v) for 1 hr. After washing, the peptide was elongated by β-Ala and
Asn residues. During the coupling of the next residue, Fmoc-β-Ala, the chain-
terminating residue Ac-Gly (25 mol%) was included in order to reduce the density of
peptides on each bead by approximately that percentage. The random positions were
synthesized next as previously described. Thus, the resultant library NH2-
XXXXβNβ(+)KM-resin is referred to as the K+ library.
4.2.6 Colorimetric Library Screening
All aspects of the SA-AP based library screening were identical to the SH2
colorimetric library screening with two exceptions. Only 50 mg of resin were screened
per reaction, and the concentrations of the domains screened were more varied. The BIR1
domain was screened versus the BIR library at concentrations up to 2 µM. The BIR2
domain was screened versus the BIR library at concentrations from 500 nM down to 1 nM. The
BIR3 domain was screened versus the K+ library at concentrations from 500 nM to 2 µM.
4.2.7 Fluorimetric Library Screening
The screening of His6-MBP-BIR3 versus 50 mg of the K+ library was attempted
fluorimetrically using several procedures. In all cases, the preparation of the resin was the same as for any other screening, including organic to aqueous washes and blocking with
buffered gelatin solutions. The first screening was performed akin to an SA-AP procedure in which 1 µM of domain was incubated overnight with library before washing and re-suspending in a Petri dish for analysis. However, for bead examination a low-power Olympus SZX12 microscope fitted with fluorescence excitation and observation optics was necessary. Namely, a high-voltage, high-intensity Hg vapor lamp light source was coupled with an excitation filter (460-490 nm) and an emission filter
(510-550 nm) (Olympus). The fluorescent beads from the screening were picked manually and checked for loss of fluorescence in an 8 M guanidine-HCl solution. Beads which failed to stop fluorescing under such conditions were discarded. This test was necessitated by the inherent fluorescence observed for some native TentaGel S resin beads. A second screening approach was attempted in which larger volumes of lower domain concentration solutions were screened against the K+ library. In this procedure,
50 mg of resin was incubated overnight at 4 ºC in 8 mL of a 500 nM BIR3 domain solution. Work-up was as before. The last form of this screening involved direct incubation in the Petri dish of 50 mg of resin with 3 mL of 1 nM BIR3 domain for 8 hr at
4 ºC and fluorescent microscopic observation without washing.
4.2.8 Partial Edman Degradation and Peptide Sequencing
The positive beads from each color intensity category were pooled and subjected to partial Edman degradation as generally described in Chapters 2 and 3. Specifically for this library, the beads were suspended in 160 µL of 66% pyridine (aq) containing 0.1%
Et3N, to which an equal volume of 5% PITC in pyridine containing Nic-OSu (6:1 mol
ratio PITC:Nic-OSu) was added. All other aspects of the degradation were as described.
4.2.9 Synthesis of Biotinylated pY Peptides
All test peptides contained a common C-terminal linker, -BBK-NH2. Each peptide
was synthesized on ~65 mg of CLEAR-amide resin using standard Fmoc/HBTU/HOBt
chemistry. The terminal FMOC protection was left in place during cleavage and
sidechain de-protection (previously described using Reagent K). Following Et2O
trituration, approximately 3 mg of the crude peptide was dissolved in a minimal volume
of DMSO (200–400 µL, with sonication) made basic with DIPEA and reacted with 1.1 equiv of NHS-PEG4-biotin (Quanta Biochem) dissolved in 25 µL of DMSO. After 45
min at room temperature, piperidine was added to 30% and allowed to mix for 20 min.
The mixture was acidified with TFA and re-triturated twice with 20 volumes of Et2O.
The precipitate was collected and dried under vacuum, and the biotinylated peptides were purified by reversed-phase HPLC on a C18 column (Vydac 300 Å, 4.6 x 250 mm). The identity and purity of each peptide were confirmed by MALDI-TOF mass spectrometric
analysis. This procedure resulted in the addition of a 15-atom hydrophilic linker between
the side chain of the C-terminal lysine and the carboxyl group of biotin.
In the instance of peptide VKTFLEABE(PEG-biotin)-NH2, which contained an
internal Lys residue, the above procedure was modified by the replacement of the linker
Lys with a Glu(PEG-biotin) moiety during synthesis. This substitution was accomplished
using the reagent FMOC-Glu(PEG-biotin)-OH from NovaBiochem. The terminal FMOC
was removed prior to cleavage/de-protection and HPLC performed as above.
4.2.10 Determination of Dissociation Constants by BIAcore
All measurements were made at room temperature on a BIAcore 3000 instrument
as described in Chapter 3 using the MBP-fusions for each domain.
4.3 Results
4.3.1 Library Construction and Screening
Having previously demonstrated the validity and usefulness of the synthetic library screening methodology with SH2 domains requiring phosphotyrosine residues, we adapted the techniques for application to the screening of BIR domains. The design of the library was similar, utilizing the same C-terminal linker. Briefly, the terminal Met allowed for the specific release of the peptide from the resin upon treatment with CNBr, and the neighboring Arg provided a locus for ionization of the released peptide during
MALDI-TOF analysis. Two β-alanines increased the distance between the randomized positions of interest and the bead surface, thereby providing the binding domains with better access to the ligand region. Finally, relatively inert Asn and Leu residues were incorporated to increase the mass of the linker above 600 Da in the mass spectrum for ease of interpretation. After synthesis of the random positions by the split-pool method
[4, 6], a library of the form NH2-XXXXLNBBRM-resin was obtained, where B is β-alanine and X
represents norleucine (Nle) or any of the natural amino acids, excluding Met and Cys. In
theory, a complete library included slightly more than 1.3 x 10^5 (19^4) unique members
and was completely covered in less than 150 mg of 130 µm TentaGel resin (8.87 x 10^5 beads/g).
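The library arithmetic quoted above (19 building blocks at four random positions, and the resin loading in beads per gram) can be reproduced with a short calculation. This is a minimal sketch with illustrative function names. The final estimate of the fraction of sequences present in a 50 mg aliquot rests on an added assumption that is not made in the text: each bead carries one sequence drawn uniformly at random, so that copy numbers are approximately Poisson distributed.

```python
import math

BUILDING_BLOCKS = 19      # 20 natural amino acids minus Met and Cys, plus Nle
RANDOM_POSITIONS = 4      # tetrapeptide library
BEADS_PER_GRAM = 8.87e5   # 130 um TentaGel resin, value quoted in the text


def library_size(building_blocks: int = BUILDING_BLOCKS,
                 positions: int = RANDOM_POSITIONS) -> int:
    """Number of unique sequences in the theoretical library."""
    return building_blocks ** positions


def beads_in_resin(mass_mg: float, beads_per_gram: float = BEADS_PER_GRAM) -> float:
    """Approximate number of beads in a given mass of resin."""
    return mass_mg / 1000.0 * beads_per_gram


def resin_for_full_coverage_mg(beads_per_gram: float = BEADS_PER_GRAM) -> float:
    """Resin mass (mg) holding as many beads as there are unique sequences."""
    return library_size() / beads_per_gram * 1000.0


def expected_fraction_sampled(mass_mg: float) -> float:
    """Expected fraction of sequences present at least once (Poisson assumption)."""
    mean_copies = beads_in_resin(mass_mg) / library_size()
    return 1.0 - math.exp(-mean_copies)


if __name__ == "__main__":
    print("unique sequences:", library_size())
    print("resin for one bead per sequence (mg):", round(resin_for_full_coverage_mg(), 1))
    print("beads in a 50 mg aliquot:", round(beads_in_resin(50)))
    print("expected fraction sampled in 50 mg:", round(expected_fraction_sampled(50), 3))
```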
Repeated fluorimetric screenings conducted by any of the methods described failed to yield reproducible results in terms of the number of positive beads observed. Furthermore, the few beads which were subsequently analyzed either yielded no signal
by MALDI-TOF MS or yielded sequences which bore no resemblance to those obtained
by the colorimetric method (Appendix Table A4).
A typical colorimetric screening experiment involved incubating a BIR domain,
which had been purified and biotinylated as the C-terminal fusion of maltose binding
protein (MBP), with 50 mg of library overnight at 4ºC with gentle mixing. After multiple
washings, domain selected peptides recruited a streptavidin-alkaline phosphatase
conjugate to the bead surface via biotin and became colorized by dye deposition upon
addition of the phosphatase substrate, 5-bromo-4-chloro-3-indolyl phosphate (BCIP).
The stringency of each screening was controlled by adjusting the concentration of the
BIR domain and was reflected in the number of positive beads obtained. The positive
beads were removed manually by pipette under a low-power dissecting microscope and
sequenced by the partial Edman degradation technique.
The BIR1 domain yielded no binding interactions when screened at a domain
concentration as high as 2 µM, whereas the BIR2 and BIR3 domains yielded 234 and 25
positive beads, respectively, at 500 nM concentrations. The BIR2 domain selected beads could be separated into two groups based on the color intensity of the positive beads;
there were 14 “dark” blue beads and 220 “light” blue beads. In the case of the BIR3
screening, no such lightly colored beads were discernable, and thus only the 25 were
collected. A sampling of 31 peptides from the BIR2 light population and the entirety of
other groups were sequenced and analyzed (Table 4.1). As expected, an alanine was
strongly selected for by both domains at the N-terminus (position 1, P1) and BIR3
strongly selected Phe and Ile residues at P4 in agreement with known IBM sequences.
Unexpected, however, were the numbers of Arg and Lys residues selected by both domains at positions P2 and P3, and the virtual absence of Phe and Ile residues at P4 for
BIR2. These initial data intrigued us and prompted further investigation.
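The positional analysis described above amounts to counting, at each of the four randomized positions, how often each residue occurs among the sequenced beads. A minimal sketch of such a tally follows. The function name is illustrative, and the input is limited to three tetrapeptides named in this chapter (Smac-like AVPI, caspase-9-like ATPF, and the screening-derived VKTF); it does not reproduce the sequences of Table 4.1.

```python
from collections import Counter


def positional_counts(peptides: list[str]) -> list[Counter]:
    """Tally residues at each position (P1, P2, ...) across equal-length peptides."""
    if not peptides:
        return []
    length = len(peptides[0])
    if any(len(peptide) != length for peptide in peptides):
        raise ValueError("all peptides must have the same length")
    return [Counter(peptide[i] for peptide in peptides) for i in range(length)]


if __name__ == "__main__":
    # Example input only: tetrapeptides mentioned in the text, not screening data.
    example = ["AVPI", "ATPF", "VKTF"]
    for position, counts in enumerate(positional_counts(example), start=1):
        print(f"P{position}:", dict(counts.most_common()))
```

Applied to the sequences recovered at each domain concentration, the same tally gives the per-position histograms from which a consensus is read.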
4.3.2 Binding Specificity of the BIR2 Domain
After the initial screening performed at a domain concentration of 500 nM, it appeared likely that a non-Smac-like IBM would emerge for BIR2, and so it was decided that more stringent conditions should be tested using lower domain concentrations in order to precisely define preferred ligands. Thus, three additional screening experiments
were attempted at MBP-BIR2 concentrations of 50, 10, and 1 nM (Table 4.1, Fig. 1).
Correspondingly, the number of binding events decreased substantially with decreasing
domain concentration, yielding 67, 12, and 5 positive beads, respectively. As was the case in the 500
nM screening, BIR2 showed exquisite selectivity for alanine at P1. At P4 Val and small
hydrophobes such as Ala and Gly were the most strongly selected residues under the 1
and 10 nM conditions, but at 50 nM laxity at this position began to surface.
Scanning the last position progressively from the 1 nM to the 500 nM screenings, there is a
noticeable increase of Tyr, Pro, and hydrophilic residues (Ser, Asp, and Glu), but still a
dearth of Phe and Ile residues. The selectivity for the P2 and P3 residues showed a more
rapid increase in laxity with increasing domain concentration. Selection at P2 was
generally for β-branched residues (Val, Thr, Ile) under stringent conditions; however,
selectivity was quickly lost as nearly every amino acid was represented among the 500
nM sequences sampled. To a lesser degree, the same was true for the P3 position. While
P2 had become tolerant of acidic, basic, hydrophilic, and larger aliphatic residues by the
50 nM screening, the P3 position never accepted acidic residues and remained limited
mostly to smaller hydrophobes and basic residues. Based on the screening-derived peptides, we can define the consensus binding motif for the BIR2 domain of XIAP as
H2N-AX(A/+/V/y/p)(V/a/g), in which X represents any amino acid, + stands for a basic
amino acid, and lower case letters are indicative of lesser selected residues.
4.3.3 Binding Specificity of the BIR3 Domain
The situation following the initial 500 nM screening experiment involving the
BIR3 domain proved to be quite different from that for the BIR2 domain. The
expectations that Smac-like AVPI or caspase-9-like ATPF ligands would be found were
partially met in the initial screening. The observation of basic residues at the P2 and P3
sites was at first unexpected, and higher stringency screenings were planned as for the
BIR2 domain. However, it was realized that the consensus already pointed toward the
expected Smac-like motif and, in contrast to the BIR2 domain, higher stringency
conditions were most likely to produce an answer we already expected, thus not really
furthering our understanding of the BIR3 domain’s full ligand potential. Therefore, it
was decided that truly new ligands for this domain should be explored through further
screenings performed at equal or lower stringencies instead. Toward this end, additional
50 mg aliquots of library were screened at 500 nM, 1 µM, and 2 µM concentrations of
MBP-BIR3. The resulting peptide sequences are listed according to screening condition in Table 4.2 and plotted together in histogram form in Figure 4.2.
At all concentrations, the BIR3 domain strongly preferred an N-terminal Ala.
Unlike BIR2, however, BIR3 tolerated a small but significant population of sequences having Val and small residues at P1. The selectivity at P2 weakly favored β-branched and basic residues, while, in contrast to the BIR2 domain, acidic residues were almost completely absent from this position. The preference for Pro and Arg at P3 changed little with decreased screening stringency, although Ile and Ala residues began appearing in the 2 µM domain screening. Lastly, the preference at P4 for Phe over Ile and Tyr residues progressively lessened with decreasing stringency. From the above screening data, we defined a consensus binding motif for the BIR3 domain of XIAP as H2N-
(A/v)(+/β)(R/P)(F/I/y).
4.3.4 Affinity Measurements of Selected Peptides
In order to confirm the binding of selected peptides, representative peptides were re-synthesized and tested by surface plasmon resonance (BIAcore). All of the peptides tested bound to their cognate BIR domains with high affinity (KD = 0.46-3.8 µM) (Table
4.3). These values agreed well with KD values previously determined by alternative techniques for similar peptides [134, 141]. As expected from the screening results, peptides 5 and 6 were quite specific for BIR3 due to BIR3’s greater tolerance of N-terminal Val residues, while peptides 1 through 4 showed reasonable cross-reactivity despite containing P4 residues not generally preferred by BIR2. Among the BIR2-selected sequences, specificity is conferred by the increased tolerance of the BIR2 domain for P3 and P4 residues, while peptide 9’s dual specificity demonstrated BIR3’s willingness to accommodate a P4 Val approximately as well as a Tyr residue.
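To relate the measured KD range to the domain concentrations used as screening stringencies, the equilibrium occupancy of a peptide can be estimated for a simple 1:1 interaction. The sketch below is illustrative only and is not the analysis performed in this work. It assumes that the free domain concentration equals the total concentration, and it ignores the high local peptide density on a bead and the enzymatic amplification of the colorimetric signal, both of which may allow beads to score positive at occupancies well below those calculated here.

```python
def fraction_bound(domain_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of peptide occupied, assuming simple 1:1 binding."""
    if domain_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return domain_nM / (domain_nM + kd_nM)


# Domain concentrations used in the screenings and the KD range of Table 4.3 (nM).
SCREENING_CONCENTRATIONS_NM = [1, 10, 50, 500, 1000, 2000]
KD_RANGE_NM = [460, 3800]

if __name__ == "__main__":
    for kd in KD_RANGE_NM:
        for conc in SCREENING_CONCENTRATIONS_NM:
            occupancy = fraction_bound(conc, kd)
            print(f"KD = {kd} nM, domain = {conc} nM: fraction bound = {occupancy:.4f}")
```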
4.3.5 Database Search for Potential BIR2 and BIR3 Binding Partners
Because of the high degree of phylogenetic conservation among apoptotic machinery, we originally performed web-based searches for proteins containing consensus sequence motifs described for the individual BIR domains (web site: http://pir.georgetown.edu/) against all known genomes. However, the results returned were too numerous to be of practical use in this setting. We therefore limited our
database searches to human proteins. As a result of the requirement of a free N-terminus,
we performed the searches in several different ways. The first attempt was the most direct: a search for the N-terminally anchored sequences NH2-AX(RKPVAY)(VTYAG)
for BIR2 and NH2-(AV)X(RKP)(FIY) for BIR3 returned 347 and 73 candidate proteins,
respectively. However, the majority of these sequences were immunoglobulins or
fragments more likely representing incomplete DNA sequencing than proteolytic
processing, and therefore, the likelihood of a true free N-terminus was unknown. Thus,
modification of the searches was in order.
Numerous additional database searches were performed by modifying the N-terminus. In one search scheme, an Asp residue or entire caspase cleavage sites (e.g.
IEAD) were added N-terminal to the Ala residue in order to resemble proteolytic
processing. However, the results were too numerous (≥1500 hits) to be of use, unless one
were looking for specific targets. Alternatively, several known IAP-antagonists are
processed to reveal their IBM motif by the removal of the N-terminal Met. Moreover,
since one of the intentions of the screening was to define a large ligand space to search
for less canonical potential interaction partners, it was decided to search for targets
possessing a free N-terminus as a result of methionine aminopeptidase (MAP)
proteolysis. For this reason, database searches were performed by adding an amino terminally anchored Met residue to the search pattern, assuming it could be lost during
MAP processing to expose the desired N-terminal residue. The BIR2 domain search
pattern was changed to NH2-MA(RKVTI)(RKPA)(VPAG) in order to reduce the number
of hits to 387, and similarly, the BIR3 search pattern NH2-M(AV)X(RKP)(FIY) returned
238 candidates. After removing redundant or fragment protein sequences, and unlikely candidates such as secreted or extracellular proteins, 127 and 65 potential targets remained for BIR2 and BIR3, respectively. For brevity’s sake, unnamed and hypothetical proteins were omitted from Tables 4.4 and 4.5.
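The N-terminally anchored search patterns quoted above can be expressed as regular expressions, which makes the anchoring and the effect of the added Met explicit. The sketch below is not the search tool used in this work (the searches were run through the web site cited above); the helper names are illustrative, and the test strings are short peptides taken from the text rather than database entries.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a motif such as 'AX(RKPVAY)(VTYAG)' to an N-terminally anchored regex.

    'X' matches any residue, and a parenthesized group matches any one of its letters.
    """
    parts = ["^"]  # anchor at the N-terminus of the sequence
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if char == "(":
            close = pattern.index(")", i)
            parts.append("[" + pattern[i + 1:close] + "]")
            i = close + 1
        elif char == "X":
            parts.append("[A-Z]")
            i += 1
        else:
            parts.append(char)
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, sequence: str) -> bool:
    """True if the sequence begins with the given motif."""
    return pattern_to_regex(pattern).match(sequence) is not None


# Search patterns as quoted in the text.
BIR2_DIRECT = "AX(RKPVAY)(VTYAG)"
BIR3_DIRECT = "(AV)X(RKP)(FIY)"
BIR2_WITH_MET = "MA(RKVTI)(RKPA)(VPAG)"
BIR3_WITH_MET = "M(AV)X(RKP)(FIY)"

if __name__ == "__main__":
    for peptide in ["AVPIA", "ATPF", "VKTF"]:
        print(peptide, "BIR2:", matches(BIR2_DIRECT, peptide),
              "BIR3:", matches(BIR3_DIRECT, peptide))
    # Prefixing Met mimics a precursor that would be processed by MAP.
    print("M + AVPIA vs BIR3 Met pattern:", matches(BIR3_WITH_MET, "M" + "AVPIA"))
```

Under this reading of the patterns, VKTF does not satisfy the BIR3 pattern because Thr is not among the residues allowed at P3, which is consistent with that sequence having been searched separately.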
One additional database search was performed for BIR3 domain ligands. In this instance, instead of searching for a pattern, a specific peptide sequence, VKTF, was searched against the human genome. This sequence was deemed special because of the
N-terminal valine residue and its exact duplication in two separate screenings. Among the 330 hits returned was caspase-10, and the VKTF coincides with the inter sub-unit processing site.
4.3.6 Probing BIR3-Caspase-10d Interactions
After Midi-Prep purification of the pCMV and pEBG constructs described earlier, mammalian transfection experiments were performed in the laboratories of Dr. Yusen Liu.
These experiments have tentatively revealed a mild inhibitory relationship between the
BIR3 domain and caspase-10d. Further experiments are on-going.
4.4 Discussion
The combinatorial library method reported in this work provides a more complete picture of BIR domain binding motifs. Compared with previous domain screening methods, our technique has several significant advantages that yielded novel sequences and additional insight into BIR domain binding. First, as previously described, our technique yields individual peptide sequences determined subsequent to screening, which is a necessary improvement over the pooled sequencing and chain termination encoding methods employed with synthetic libraries in the past. Second, in contrast to phage display methods, the risk of positional bias due to enzymatic processing and secretion of displayed coat proteins is greatly diminished or absent from synthetic libraries [34-36].
Moreover, the occurrence of such biases in a pIII-fusion phage library has been
demonstrated to be most severe at positions +1 and +3 from the N-terminus, two
important recognition residues involved in BIR domain binding [142]. Third, despite
recent advances in the incorporation of several unnatural amino acids into phage display
libraries [143], the monomeric diversity of synthetic libraries is nearly limitless and
technically far less demanding, allowing future peptidomimetic inhibitor screenings. And
finally, in the absence of selection and amplification, many different stringency
conditions can be screened in order to discover alternative ligand types. Because the screenings
are conducted under multiple stringencies without enrichment, qualitative
assessments of positional contributions to ligand binding can be made by monitoring the
relative laxity introduced by changing conditions.
The BIR2 and BIR3 domains of XIAP have been extensively characterized by
many different techniques, and several binding partners, caspases-3,-7, and -9, Smac,
HtrA2, GSPT1, and Chk1, have been described in vitro and in vivo [133, 135, 144-147].
Unlike the BIR containing IAPs survivin and BRUCE, the physiological function of
XIAP has to date been limited to apoptosis inhibition, and its defined binding partners all
possess standard N-terminal IBMs. However, by defining the entire peptidic ligand space
for these domains it is expected that new target proteins may emerge, possibly in
biological systems outside of apoptosis. Additionally, as a result of measuring the
binding affinities of the peptides derived from these screenings, the sequences themselves
may be of use in developing new, more selective domain-specific peptide-based
inhibitors for research use.
Previous library screenings have been described using phage display libraries and
various BIR domains, including XIAP’s [35, 148]. Comparison of these literature results
and those derived here reveals general agreement, with a few notable differences. Both
techniques agree there are marked contrasts between the BIR2 and BIR3 consensuses,
but two main differences arise in the details of the individual consensuses. First, the P1
Val residue tolerated by the BIR3 domain had not been elucidated in the earlier
screenings. In fact, Franklin et al. reported stronger P1 Ala specificity for the BIR2
relative to BIR3 domain, whereas we observed the opposite trend even at 500 nM domain
concentration. Our BIAcore studies of peptides bearing an N-terminal Val residue suggest that
they are legitimate potential targets for the BIR3 domain based upon the observed KD,
and that they are quite specific for BIR3 relative to BIR2 (Table 4.3). Moreover, the
VKTF sequence found in two screenings coincides with the processed N-terminus of caspase-10. Second, as expected from the chosen selection conditions, the selectivity for
Pro at P3 was not nearly as strict as previously reported for BIR3. However, it is somewhat surprising that the residue most often selected in its place is Arg, and comparison of the KD
values for peptides 1 and 3 shows only a ~2-fold difference.
Although the individual sequences derived from this screening technique are
potentially useful in their own right for future research endeavors, it has been our
intention from the outset to apply these data groupwise in database searches for in vivo
binding targets. Between the BIR2 and BIR3 domains, the latter recognizes ligands
which conform more closely to the standard Smac-like IBM, except for the P1 valine-containing
sub-group. It was this group that piqued our interest because of its novelty.
The ligand search for VKTF returned an interaction candidate among apoptosis-related
proteins, caspase-10. Subsequent mammalian cell lysate experiments conducted in the laboratory of our collaborator have demonstrated a weak inhibitory relationship between the BIR3 domain and caspase-10d activity. Further experiments are being pursued in order to confirm these tentative data.
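A groupwise search of this kind can be sketched as a position-specific pattern match against protein N-termini. The sketch below is illustrative only: the residue sets, the helper names (build_pattern, scan), and the toy sequences are assumptions and do not reproduce the consensus or the database actually queried. The initiator Met is treated as optional, on the assumption that it may or may not be present in the deposited sequence.

```python
import re

def build_pattern(positions, allow_initiator_met=True):
    """Build an N-terminus-anchored regex from per-position residue sets (P1, P2, ...)."""
    body = "".join("[" + "".join(sorted(residues)) + "]" for residues in positions)
    prefix = "^M?" if allow_initiator_met else "^"
    return re.compile(prefix + body)

def scan(proteins, pattern):
    """Return the names of proteins whose N-terminus matches the pattern."""
    return [name for name, seq in proteins.items() if pattern.match(seq)]

# Hypothetical P1-P4 residue sets, loosely modeled on a BIR3-like consensus.
bir3_like = [set("AV"), set("KRTQ"), set("PRKT"), set("FIY")]

# Toy sequences for illustration only; these are not real database entries.
toy_proteins = {
    "toy_1": "MAKPFSLLE",
    "toy_2": "VKTFLEAGD",
    "toy_3": "MGSSHHHHH",
}

hits = scan(toy_proteins, build_pattern(bir3_like))
print(hits)
```

Anchoring the pattern at the first residue reflects the requirement that these ligands be recognized at a free N-terminus; internal occurrences of the same tetrapeptide are deliberately ignored.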
4.5 Conclusion
The current work demonstrates the feasibility and potential usefulness of the resin-bound library screening technique. The ability to sequence large numbers of beads individually, rapidly, and inexpensively with minimal encoding marks a significant advancement in the use of modified peptide libraries. The three SH2 and two BIR domains screened in this work are from relatively well-characterized proteins. Our screening results both agree with the established literature and extend the possibilities for discovering unique interactions. While further improvement and refinement of the described methodology itself is anticipated, the application of the derived data to the design of future biological experiments is the true goal. Toward this end, successful collaborations are being cultivated and will undoubtedly yield insights into the complex workings of signal transduction.
1nM 50nM ATVT AVQA AMGY 500nM AVHA ALRG AVVV AYPV AKAT ARIG AISY AVKV ASYA AMRG ATPV ADVV AQGT AEVG AVSY ATKV ASVA ATRP AIKV AQAV AQST ASYG AYPS ASRV AQKA AIRP xYPY AKAV ASRT ARYG AAVS ANAV AQAA AYNP ASMV ATRT AYSG ARVS AWAV AQYA AKVP 10nM AYMV AKAI ADKG AQYS ATKT ARNA AEPF AVVV AVWV AEMI AEKG AYNS ATAT ATVG ATSS AYPV ANYV ATRI AVVP AVRS AVGT AQPG ATAN ASAV ARSV ATYA AVNP AIVD AVMT AQKG AEAD ARAT AVNV AMYA AVSF AIKD ANAT AEAG ARKE ATIA AIQV ARYA AKRF AYAE AVHI AEIG ATRH AINA AKRV AAIA AIPY AQGE AVKI ADAG ASAK ATKG ANRV ARIA ATPY AQPE AMKI ARQG AQKR AVKG AWRV ARVA AIAY AQAQ AQYG AYRV AQNA ARAY EFAS AIAK AHKV AYAA AEGY FVAG
Table 4.1. Peptides selected by the BIR2 domain during screenings at the indicated domain concentrations. Boldface indicates the “dark” beads selected by the BIR2 domain.
[Figure 4.1 graphic: histograms of residue occurrence (y-axis) at positions P1-P4 for each amino acid (x-axis, ordered D E N Q H K R W F Y M L I V T S A G P), panels A and B.]
Figure 4.1. Histogram of the BIR2 selected sequences. (A) Combined 1, 10, and 50 nM domain concentration screenings. (B) 500 nM domain concentration. The extreme discrimination of alanine residues is apparent from all screening conditions.
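The per-position tallies underlying a histogram of this type can be sketched as follows. This is a minimal illustration, not the procedure actually used to prepare the figure; the function name positional_counts is an assumption, and the input is only a small subset of the sequences listed in Table 4.1.

```python
from collections import Counter

def positional_counts(peptides):
    """Count residue occurrences at each position of equal-length peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    return [Counter(p[i] for p in peptides) for i in range(length)]

# A few sequences copied from Table 4.1, used purely as sample input.
sample = ["ATVT", "AVQA", "AMGY", "AVHA", "AVVV", "EFAS"]

counts = positional_counts(sample)
for i, counter in enumerate(counts, start=1):
    # Report the three most frequent residues at each position P1-P4.
    print(f"P{i}: {counter.most_common(3)}")
```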
3 x 500 nM AVPY AIMF AIKF AIPI ARPA AFPF AAMF AIRF AYPI ARRS AYPY AKIF ATRF ARPI VKPF AHPF AKIF ASRF ARPI VRPF AHPY AKVF AMRF ARPV VRPI AMPF AKSF AMRF AIRI VKTF AAPY AMAF AMRF AKRI TKRF ASPF AIAF AYRF ARRI FDHI AGPF AIGF ARRY ARPS MQII ARPF AKAF ARRY ARPS RQTA AKGF QGRW 1 x 1 µM ATPF AMGF AGRF ATPI ATRS ATPY ARIF ARRF ASPI ARPQ AAPF AFKF AYRF AYKI VTGF AHPF AKKF AFRY AVRI VARF AYPY ARKF AKRY AVRI VQRF AKPF AKMF AKRY AIRI VKTF AIAF AARF ARRY AARI SRPF AKAF ASRY AVPI AKRI ESPW AKAF AVRF ATPI ARPG 1 x 2 µM ATPF AFRF ARPI ARRI ARPN AVPF AYRF AVII AKRI AVPE AAPF ALRY AVII ARAT ARRE ATAY AGRF AKII ALPT VAIF AAAF AKAY AKIV AFPT VRAF AAVF AKSF AQII AAPT VRRF AVIF AMTF ARVI ARPT VQRF AIMF AQRY ARTI ATRT TKRI AKMF AAPI ANAI AIPS IRTF AVKF AAPV AVRI AIRA IRRF AFRF AFPI AIRI AFPG NHGW
Table 4.2. BIR3 domain selected peptides from indicated screening conditions.
Boldface is used to highlight the duplication of the unusual VKTF sequence in two separate screenings.
[Figure 4.2 graphic: histograms of residue occurrence (y-axis) at positions P1-P4 for each amino acid (x-axis, ordered D E N Q H K R W F Y M L I V T S A G P).]
Figure 4.2. The combined histogram of the BIR3 domain screenings performed at 500 nM, 1 µM, and 2 µM concentrations.
Peptide           BIR2           BIR3
1  ATPF           2.3 ± 0.3      0.46 ± 0.03
2  ATPI           2.7 ± 0.2      1.0 ± 0.08
3  ATRF           3.4 ± 0.2      1.1 ± 0.2
4  ATRY           6.9 ± 0.9      1.4 ± 0.2
5  VRRF           N/B            0.85 ± 0.07
6  VKTFLEA‡       N/B            2.1 ± 0.2
7  AVVV           0.87 ± 0.3     N/B
8  ATRP           3.8 ± 0.2      N/B
9  AVRV           1.0 ± 0.1      1.5 ± 0.2
Table 4.3. Dissociation constants (µM) of library selected peptides toward BIR2 and
BIR3 domains. All peptides, except (‡), contain the C-terminal linker -BBK-NH2, the lysine side chain of which was acylated with a PEG4-biotin moiety for immobilization on
BIAcore streptavidin sensorchips. Each domain was prepared as a C-terminal fusion to
MBP as described in Methods. N/B indicates no detectable binding at 30 µM domain
concentration. ‡Peptide was derived from the N-terminus of the large subunit of
processed caspase-10d; this peptide was synthesized with the C-terminal linker
-BE(PEG-biotin)-NH2. The N-terminal tetrapeptide was selected in both a 500 nM and a 1 µM
screening.
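The domain selectivity implied by these constants can be expressed as a simple KD ratio. The sketch below transcribes a subset of the Table 4.3 values; the names KD and bir3_preference are illustrative assumptions, and the ratios ignore the reported experimental errors.

```python
# KD values (µM) transcribed from Table 4.3 as (BIR2, BIR3);
# None marks "N/B" (no detectable binding).
KD = {
    "ATPF": (2.3, 0.46),
    "ATPI": (2.7, 1.0),
    "ATRF": (3.4, 1.1),
    "ATRY": (6.9, 1.4),
    "VRRF": (None, 0.85),
    "AVVV": (0.87, None),
    "AVRV": (1.0, 1.5),
}

def bir3_preference(peptide):
    """Return KD(BIR2)/KD(BIR3); values above 1 mean tighter binding to BIR3."""
    bir2, bir3 = KD[peptide]
    if bir2 is None or bir3 is None:
        return None  # ratio is undefined when one domain shows no binding
    return bir2 / bir3

for name in KD:
    ratio = bir3_preference(name)
    print(name, "n/a" if ratio is None else f"{ratio:.1f}")

# Effect on BIR3 binding of replacing Pro with Arg at P3 (peptide 3 vs. peptide 1).
p3_effect = KD["ATRF"][1] / KD["ATPF"][1]
print(f"P3 Pro->Arg: {p3_effect:.1f}-fold")
```

The last ratio corresponds to the roughly 2-fold difference between peptides 1 and 3 noted in the discussion.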
Table 4.4. Potential BIR2-mediated binding partners derived from the Protein Interaction
Resource database search.
Protein    Motif
Zinc Finger Protein 202    MATAV
Mitochondrial Ribosomal Protein S5    MATAV
Tripeptidyl Peptidase II    MATAA
LAG1 Longevity Assurance Homolog 5    MATAA
Squamous Cell Carcinoma Antigen Recognized by T-Cells 3 (SART3)    MATAA
Remodeling and Spacing Factor 1    MATAA
Procollagen (type III) N-Endopeptidase    MATAA
POU Domain, Class 3, Transcription Factor 3    MATAA
WS β-Transducin Repeats Protein    MATAA
Low-Density Lipoprotein B    MATAA
N-Oct 3    MATAA
Nicastrin    MATAG
Kallikrein 4    MATAG
RAB14    MATAP
Protein Kinase C, D2 type    MATAP
Nedd4 Binding Protein 3    MATAP
Katanin p80 Subunit B1    MATPV
Translation Initiation Factor 3, Subunit 5ε    MATPA
GRINL1A Downstream Protein Gdown 4    MATPA
BCL2-like 2 Protein    MATPA
ARNT2 Protein    MATPA
ATP-Dependent RNA Helicase #3    MATPA
Calponin Homology Domain Containing 1    MATPG
Phosphatidylinositol-4-phosphate 5-kinase type IIα    MATPG
PP13    MATPP
UXT Protein    MATPP
Deoxycytidine Kinase    MATPP
Zinc Finger Protein 268    MATRV
Gamma-Taxilin    MATRV
Zinc Finger Transcription Factor BTEB2    MATRV
P120    MATRG
SH3PX1 Protein    MATKA
Superoxide Dismutase    MATKA
Kelch/Ankyrin Repeat Containing Cyclin A1 Interacting Protein (KARCA1)    MAVAV
Septin 11    MAVAV
RNA Helicase    MAVAV
SCAND2 Protein    MAVAV
Pinin    MAVAV
β1-Syntrophin    MAVAA
Mitochondrial 28S Ribosomal Protein S32    MAVAA
Thyroid Hormone Receptor Interactor 4    MAVAG
ABCD4    MAVAG
Ubiquitin Specific Protease 11    MAVAP
COM Domain Containing 10    MAVPA
Putative Tumor Suppressor Gene 26 Protein    MAVPA
THO Complex 3    MAVPA
Dachshund    MAVPA
Phosphodiesterase 3A    MAVPG
G Patch Domain Containing 3    MAVPG
PAP Associated Domain Containing 1    MAVPG
Dual-Specificity Tyrosine-Phosphorylation Regulated Kinase 1B (DYRK1B)    MAVPP
MICAL-Like Isoform 2    MAVPP
Ethanolamine Kinase 2    MAVPP
GROS1-L Protein    MAVRA
Integrin β4 Binding Protein    MAVRA
Bagpipe homeobox 1    MAVRG
Solute Carrier Family 25    MAVKV
Dedicator of Cytokinesis 6 (DOCK6)    MAIAG
Mitochondrial Serine Hydroxymethyltransferase    MAIRA
L-Asparaginase    MARAV
Centrosomal Protein of 72 kDa    MARAG
Spire Homolog 2    MARAG
Leucine-Rich Repeats and Immunoglobulin-Like Domains 1    MARPV
Winged Helix Domain-Containing Protein    MARPV
Albumin D-Box Binding Protein    MARPV
Brain Acyl-CoA Hydrolase (BACH)    MARPG
ARPG863    MARPG
Neuromedin B    MARRA
Constitutive Androstane Receptor SV16    MARRP
SPRY Domain-Containing SOCS Box Protein SSB-3    MARRP
c-myb    MARRP
v-myb myeloblastosis viral oncogene homologue    MARRP
WD Repeat Domain 40A    MARKV
aarF Domain Containing Kinase 1    MARKA
HSP70-2    MAKAA
RNA Processing Factor    MAKAG
NIR1    MAKAG
Ankyrin Repeat Domain 2    MAKAP
Annexin VI    MAKPA
Hexokinase 1    MAKRA
Zinc Finger Protein 479    MAKRP
Zinc Finger Protein 679    MAKRP
Flavin Containing Monooxygenase 4    MAKKV
Table 4.5. Potential BIR3-mediated binding partners derived from the Protein Interaction
Resource database search.
Protein    Motif    Ref
Checkpoint Kinase (Chk1)    MAVPF    [147]
Creatine Kinase, Mitochondrial 1    MAGPF
Nuclear Prelamin A Recognition Factor-Like Variant    MASPF
Tropomodulin 3    MALPF
HMBA-Inducible Protein HEXIM1    MAEPF
Similar to Hephaestin    MAQPF
Translocase of Outer Mitochondrial Membrane 34    MAPKF
Sec61 Homolog    MAIKF
Heart Alpha-Kinase    MASKF
Mitogen-Activated Protein Kinase 6    MAEKF
Four and a Half LIM Domains 1 (FHL-1, SLIM1)    MAEKF
Aldolase B    MAHRF
Purinergic Receptor P2X1    MARRF
39S Ribosomal Protein L45, Mitochondrial Precursor    MAAPI
Microcephalin    MAAPI
RAB33A, RAS Oncogene Family Member    MAQPI
GTP-Binding Protein S10    MAQPI
Interferon-Responsive Finger Protein 1, Short Form    MASKI
Ring Finger Protein 38    MACKI
Similar to Schlafen 5    MAMKI
Similar to 40S Ribosomal Protein S3    MAARI
F-Box Protein 47    MASRI
Transcription Factor EB    MASRI
Myeloma Overexpressed    MALRI
Acyl-CoA Synthase 4    MAKRI
Sorting Nexin 16    MATPY
NPAS1 Protein    MAAPY
Hexaribonucleotide Binding Protein 1    MAQPY
Phospholipase C Beta 4 Isoform A    MAKPY
Carboxypeptidase Z Isoform 3    MAWPY
Proteasome (prosome, macropain) Subunit, Alpha Type 8 (PMSA 8)    MASRY
Pheromone Receptor    MASRY
60S Ribosomal Protein L36    MALRY
Meis2    MAQRY
Myeloid Ecotropic Viral Integration Site 1 Homolog 2 Isoform 1 (Meis1)    MARRY
Meis1-Related Protein 2 (Meis3)    MARRY
T-Cell Receptor Alpha Chain V Region    MVLKF
PNAS-124    MVKKF
PNAS-131    MVKKF
Ubiquinol-Cytochrome C Reductase    MVTRF
Actin-Filament Binding Protein Frabin    MVNKI
VPRI645    MVPRI
26RFa Precursor    MVRPY
Orexigenic Neuropeptide QRFP Precursor    MVRPY
SEC14L    MVQKY
CHAPTER 5
MATERIALS and GENERAL METHODS
5.1 Materials.
Escherichia coli peptide deformylase was overexpressed in E. coli BL21(DE3) strain and purified to apparent homogeneity with Co2+ as the divalent metal [54]. All general laboratory chemicals, buffers, and solvents were obtained from Sigma-Aldrich
(St. Louis, MO), VWR International (West Chester, PA), or Fisher Scientific
International (Hampton, NH). All electrophoresis materials were purchased from BioRad
Laboratories (Hercules, CA). All peptide synthesis reagents, including Fmoc-protected amino acids, 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate
(HBTU), 1-hydroxybenzotriazole (HOBT), Nic-OSu, and resins were purchased from
Advanced Chemtech (Louisville, KY), Peptides International (Louisville, KY), or
Novabiochem (Darmstadt, Germany). Streptavidin-Alkaline Phosphatase was obtained from PROzyme (San Leandro, CA). All DNA restriction endonucleases and modifying enzymes were purchased from New England Biolabs (NEB, Beverly, MA), whereas
DNA oligos for PCR were from Integrated DNA Technologies (IDT, Coralville, IA).
DNA Clean-Up Kits and other spin-column-based DNA purification kits were from
Qiagen (Valencia, CA). Other chemicals, including isopropyl-β-D-thiogalactopyranoside
(IPTG), phenylmethanesulfonyl fluoride, kanamycin, ampicillin, and β-mercaptoethanol
were purchased from Aldrich.
5.2 Buffers
Buffer 1: 100 mM Bis-Tris, pH 6.7, 1 mM EDTA
Buffer 2: 10 mM HEPES, pH 8.0, 35 mM NaCl
3X Buffer 3: 300 mM Tris, pH 8.25, 15 mM EDTA, 6 M urea
Buffer 4: 10 mM Bis-Tris, pH 6.5, 0.5 mM EDTA, 50 mM NaCl
Buffer 5: 50 mM Tris, pH 7.6
Buffer 6: 50 mM Tris, pH 8.25, 2.5 mM EDTA, 1 M urea
Buffer 7: 30 mM HEPES, pH 7.5, 150 mM NaCl, 5 mM BME, 5 mM EDTA, 0.5 mM
benzamidine, 20 µg/mL soybean trypsin inhibitor, 20 µg/mL leupeptin, 20 µg/mL
pepstatin, 0.2 % Triton X-100, 0.5 mg/mL protamine sulfate
Buffer 8: 20 mM HEPES, pH 8.15, 150 mM NaCl, 2 mM BME, 10 mM maltose
Buffer 9: 20 mM HEPES, pH 7.4, 150 mM NaCl, 2 mM BME
Buffer 10: 10 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA, and 10 µM
tris(carboxyethyl)phosphine
Buffer 11: 30 mM HEPES, pH 7.85, 500 mM NaCl, 0.5 mg/mL protamine sulfate, 0.5
mM benzamidine, 20 µg/mL soybean trypsin inhibitor, 20 µg/mL leupeptin, 20
µg/mL pepstatin, 0.1 % Triton X-100 (optional)
Buffer 12: 10 mM HEPES, pH 7.85, 500 mM NaCl
Buffer 13: 10 mM HEPES, pH 7.4, 150 mM NaCl, 125 mM imidazole
Buffer 14: 10 mM Tris, pH 7.8, 5 mM NaCl, 1 mM EDTA, 10 mM BME, 0.5 mM
benzamidine, 20 µg/mL soybean trypsin inhibitor, 20 µg/mL leupeptin, 20 µg/mL
pepstatin, 0.3 mM PMSF, 0.5 mg/mL protamine sulfate
Buffer 15: 10 mM Tris, pH 7.8, 5 mM NaCl, 1 mM EDTA, 10 mM BME
Buffer 16: 10 mM Tris, pH 7.8, 500 mM NaCl, 1 mM EDTA, 10 mM BME
Buffer 17: 10 mM MES, pH 5.5, 0.5 mM EDTA, 10 mM BME
Buffer 18: 10 mM MES, pH 5.7, 500 mM NaCl, 0.5 mM EDTA, 10 mM BME
Buffer 19: 30 mM HEPES, pH 7.4, 150 mM NaCl, and 0.05% Tween 20
Buffer 20: 30 mM Tris, pH 7.6, 1 M NaCl, 10 mM MgCl2, 70 µM ZnCl2, 20 mM
potassium phosphate
Buffer 21: 30 mM Tris, pH 8.5, 100 mM NaCl, 5 mM MgCl2, 20 µM ZnCl2
HBS-EP Buffer: 10 mM HEPES, pH 7.4, 150 mM NaCl, 3 mM EDTA, and 0.005 %
polysorbate 20
Strip Buffer: 10 mM NaCl, 2 mM NaOH, and 0.025% SDS
Buffer 22: 10 mM HEPES, pH 7.6, 150 mM NaCl, 2 mM DTT, 0.5 mM benzamidine, 20
µg/mL soybean trypsin inhibitor, 20 µg/mL leupeptin, 20 µg/mL pepstatin, 0.2 %
Triton X-100, 40 U DNase (Promega)
Buffer 23: 10 mM HEPES, pH 7.6, 150 mM NaCl, 8 mM DTT, 1 mM EDTA
TE Buffer: 10 mM Tris, pH 8.0, 1 mM EDTA
Gel Extraction Buffer: 500 mM ammonium acetate, pH 8.0, 1 mM EDTA
5.3 General Biochemical and Biological Methods
5.3.1 Materials
Phenol was purchased from Fisher Scientific, prepared as described [149] prior to use, and stored in 10 mM Tris.HCl, pH 8.0, 1 mM EDTA, 0.1% β-mercaptoethanol at 4 °C for up to 1 month. “Chloroform mixture” refers to a 24:1 mixture of chloroform and 3-methyl-1-butanol.
Chloroform was ACS certified. Deoxynucleotide mixture was from Stratagene. Ethidium bromide, bromophenol blue, xylene cyanol, ethylenediamine tetra-acetic acid (EDTA),
isopropyl-β-D-thiogalactopyranoside (IPTG), and DTT were from Sigma.
5.3.2 Growth Media
Dry bacterial growth media (bactotryptone, yeast extract, casamino acids, and
bactoagar) were obtained from Difco. All growth media were prepared with ddH2O and
sterilized by autoclaving for 20 min. at 20 lb/sq. in. on liquid cycle. Antibiotics,
inorganic salts, 20% (w/v) glucose, and 1 M MgSO4, were dissolved in ddH2O and
sterilized by passing through sterile 0.45 µm filters (Acrodisc, Gelman Sciences).
Antibiotics were added directly to the growth media after cooling to less than 50 ◦C.
Luria-Bertani Medium (LB): 10 g/L bactotryptone, 5 g/L yeast extract, 10 g/L NaCl.
LB Plates: LB medium plus 15 g/L bactoagar.
Escherichia coli Strains. The following strains were used in this work. Genotypes are taken from the 2005-2006 NEB catalogue. The description of the Rosetta strain is from the 2005 on-line Novagen catalogue.
BL21(DE3) F- ompT hsdSB (rB- mB-; an E. coli B strain) gal dcm (DE3). This strain
carries a copy of the T7 RNA polymerase gene on its chromosome (λ lysogen). It has neither methylation nor restriction function, and therefore
it is useful for preparing DNA free of methylation. It is also deficient in
lon protease and is good for overproduction of foreign proteins.
BL21(DE3) (Rosetta) This strain is similar to the BL21(DE3) except it contains pRARE
(CmR). The strain supplies tRNAs for the codons AUA, AGG, AGA,
CUA, CCC, and GGA on a compatible chloramphenicol-resistant plasmid.
This strain is used to express gene sequences that contain many codons
infrequently used by E. coli (i.e., biased codons).
DH5αF’ F’/endA1 hsdR17 (rK- mK+) glnV44 thi-1 recA1 gyrA (Nalr) relA1 ∆(lacZYA-argF)U169
deoR (Ф80dlac∆(lacZ)M15). This strain is deficient in DNA
recombination. Plasmid DNA prepared from this strain usually has very
high quality. It is also sometimes used to overproduce proteins.
XL1-Blue F’::Tn10 proA+B+ lacIq ∆(lacZ)M15/ recA1 endA1 gyrA96 (Nalr) thi
hsdR17 (rK- mK-) glnV44 relA1 lac.
5.3.3 Growth and Storage of Bacterial Strains
E. coli were stored on agar plates at 4 ◦C for periods up to one month or as frozen
glycerol cultures at -80 ◦C indefinitely.
E. coli strains DH5αF’ and BL21(DE3) were streaked on LB plates and strain
BL21(DE3) (Rosetta) was streaked on LB plus chloramphenicol (35 mg/L) plates if no
additional antibiotic resistance was conferred by plasmid. Cells from a frozen glycerol
culture were streaked on an appropriate plate and then incubated at 37 ◦C until individual
colonies were 1-2 mm in diameter.
To store cells as frozen glycerol cultures, 0.6 mL of liquid bacterial culture was added to 0.4 mL of glycerol in a sterile microcentrifuge tube (1.5 mL). The culture was mixed with the glycerol by pipette and transferred to -80 °C for long-term storage.
Overnight cultures were prepared by picking a single, well-isolated colony from
an agar plate and using it to inoculate 1-10 mL of the appropriate sterile medium broth.
Cells were allowed to grow at 37 ◦C for 8-16 hr.
5.3.4 Preparation of Competent Cells
Unless otherwise noted, all sterile pipettes, tubes, and solutions were pre-chilled
to 4 ◦C, and all processes were performed on ice or in a cold room. 250 mL of LB media
was inoculated by the addition of 2.5 mL of the overnight culture of an appropriate strain.
The cells were allowed to grow with aeration at 37 °C until the OD600 reached 0.4-0.6. The
cells were transferred to a sterile GS-3 centrifuge tube, cooled on ice 5-10 min., and centrifuged at 5,000 rpm for 5 min. The pellet was gently resuspended in 10 mL of ice cold competent solution A (30 mM KOAc, pH 5.8, 100 mM RbCl, 10 mM CaCl2, 50 mM
MnCl2, and 15% (v/v) glycerol) and kept on ice for 5 min. The cell suspension was
centrifuged again (5,000 rpm, 5 min.) and the resulting pellet was resuspended in 10 mL
of ice cold competent solution B (10 mM MOPS, pH 6.5, 75 mM CaCl2, 10 mM RbCl,
and 15% (v/v) glycerol). Competent cells in solution B were stored at -80 ◦C for up to 6
months.
5.3.5 Quantitation of DNA and RNA
Pure samples of DNA and RNA (i.e. free of proteins, phenol, agarose,
nucleotides, or other nucleic acids) were quantitated by measuring the absorbance at 260 nm using a UV/vis spectrophotometer as described. An OD260 of 1 corresponded to
approximately 50 µg/mL for double-stranded DNA, 40 µg/mL for large single-stranded
DNA, and 27 µg/mL for single-stranded oligonucleotides. The purity of a preparation of
DNA or RNA was estimated by reading its ratio of OD260/OD280. Pure preparations of
DNA and RNA should have OD260/OD280 values of 1.8 and 2.0, respectively. When the amount of a sample was small (<1 µg) or the sample was heavily contaminated with other substances that absorbed at 260 nm, the quantity of DNA (or RNA) in the sample was estimated by comparing the fluorescence yield of the sample with that of a series of standards.
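The absorbance-based estimate above amounts to a single multiplication. The following sketch uses the conversion factors stated in the text; the function names, the dilution argument, and the example readings are illustrative assumptions.

```python
# Concentration (µg/mL) corresponding to an OD260 of 1, as stated in the text.
FACTORS = {"dsDNA": 50.0, "ssDNA": 40.0, "oligo": 27.0}

def concentration_ug_per_ml(od260, kind="dsDNA", dilution=1.0):
    """Estimate nucleic acid concentration of the undiluted stock from OD260."""
    return od260 * FACTORS[kind] * dilution

def purity_ratio(od260, od280):
    """OD260/OD280; about 1.8 is expected for pure DNA and 2.0 for pure RNA."""
    return od260 / od280

# Hypothetical reading: a plasmid stock diluted 100-fold before measurement.
stock = concentration_ug_per_ml(0.25, kind="dsDNA", dilution=100)
print(f"{stock:.0f} µg/mL, OD260/OD280 = {purity_ratio(0.25, 0.135):.2f}")
```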
5.3.6 Protein Quantitation
Protein concentrations were determined by the Bradford method using bovine serum albumin (BSA, Sigma) as standard. The Bradford reagent was obtained from
BioRad as a 6X stock, which was diluted with ddH2O to 1X. Construction of a standard
curve was performed by plotting the observed OD595 for five solutions containing
Bradford reagent versus their known concentrations of BSA (0, 1, 2, 4, and 6 µg/mL).
After regression analysis of the standard curve, the OD measurements of unknown samples could be used to determine the concentrations of the original unknown stock.
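The regression step can be sketched with an ordinary least-squares line. The OD595 readings below are hypothetical placeholders rather than measured values, and the names fit_line and protein_conc are illustrative; only the BSA concentrations are taken from the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

bsa_ug_per_ml = [0, 1, 2, 4, 6]        # standards listed in the text
od595 = [0.00, 0.05, 0.10, 0.20, 0.30]  # hypothetical readings

slope, intercept = fit_line(bsa_ug_per_ml, od595)

def protein_conc(od, dilution=1.0):
    """Convert an OD595 reading to µg/mL of the original stock."""
    return (od - intercept) / slope * dilution

# Hypothetical unknown diluted 200-fold into the assay.
print(f"{protein_conc(0.15, dilution=200):.0f} µg/mL")
```

Readings that fall outside the range spanned by the standards should be re-measured at a different dilution rather than extrapolated.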
5.4 Electrophoresis
5.4.1 Agarose Gel
Agarose gel electrophoresis was used for separation of large DNAs (>150 bp).
Agarose was added to 100 mL of 0.5x TBE buffer (44.5 mM Tris, 41.5 mM boric acid, 1 mM EDTA, pH 8.3) to make a final concentration of 0.5-1.5% (w/v). The agarose was melted by heating in a microwave and 1-2 µL of an aqueous solution of ethidium bromide (10 mg/mL) was added. After mixing by gentle swirling, the agarose was poured into a horizontal slab gel tray (110 x 140 mm) and allowed to solidify at 4 ºC.
DNA samples were mixed with 1/6 volume of a 6x loading buffer (30% (v/v) glycerol, 0.25% (w/v) bromophenol blue in ddH2O) and loaded into 5 x 2 (1) x 5-10 mm
(length x width x depth) wells. Electrophoresis was carried out at constant voltage (100 V)
submerged in 0.5x TBE buffer until the desired separation was achieved. The DNA-
containing bands were visualized under either long-wavelength (preparative) or short-
wavelength (analytical) UV light.
5.4.2 Polyacrylamide Gels for Protein Separation
Proteins were separated by sodium dodecyl sulfate-polyacrylamide gel
electrophoresis (SDS-PAGE) according to the method of Laemmli [150]. The gel
consisted of a stacking layer (175 x 30 x 1.5 or 0.75 mm) and a separating layer (175 x
120 x 1.5 or 0.75 mm). The percentage (10-20%) of the acrylamide of the separating
layer depended on the sizes of the proteins in the sample. An appropriate volume of a
30% acrylamide stock solution (bis(acrylamide):acrylamide, 1:29) was diluted with
ddH2O, separating gel buffer (325 mM Tris.HCl, pH 8.8, 0.1% SDS (w/v)) and 1.6 mg/mL ammonium persulfate. The resulting solution was degassed, activated by the
addition of 8.0 µL of TEMED, and poured between two assembled glass plates. The
surface of the separating gel was smoothed by adding a few drops of n-butanol saturated
with water on the top of the gel. The gel was allowed to polymerize at ambient
temperature for 15-20 min and the n-butanol was removed by rinsing the gel with water.
The stacking gel solution containing 4% acrylamide mixture, 1.6 mg/mL ammonium persulfate, and 8.0 µL of TEMED in 1x stacking gel buffer (125 mM Tris.HCl, pH 6.8,
0.1% SDS) was then added onto the top of the polymerized separating gel. The sample
loading wells were formed by inserting combs of the appropriate width into the top of the
stacking gel.
Protein samples were mixed with an equal volume of a 2x loading buffer (125 mM Tris.HCl, pH 6.8, 4% (w/v) SDS, 10% (v/v) β-mercaptoethanol, 20% (v/v) glycerol,
0.2% (w/v) bromophenol blue) and heated at 95-100 ◦C for 5 min prior to loading. The
gels were electrophoresed in Tris-glycine buffer (25 mM Tris, 192 mM glycine, 0.1%
SDS, pH 8.0) at 200 V until the bromophenol blue had migrated near the bottom of the
gel. The gels were then removed from the glass plates, stained in 40% methanol, 10%
acetic acid in water containing 0.25% (2 hr staining) or 0.05% (overnight staining)
Coomassie Brilliant Blue R-250, and destained by soaking in a solution of 40% (v/v)
methanol, 10% acetic acid in water.
5.4.3 Urea-PAGE Gels for Oligonucleotide Purification
Oligonucleotides were separated on PAGE gels using urea as the denaturant. The
percentage of polyacrylamide used in the gel depended on the size of the oligo being
purified (for ~19-mers, a 19% gel was used; for ~26-mers, a 15% gel was used; and for
~45-mers, a 12% gel was used). The ratio of acrylamide to bisacrylamide was always 29:1.
Recipes are shown below.
Gel components              Thin gel (0.4 mm), in 100 mL     Thick gel, in 600 mL
                            conc.       mass (g)             conc.       mass (g)
urea                        7 M         42.042               7 M         252.252
tris free base              100 mM      1.2114               25 mM       1.817
boric acid                  100 mM      0.6183               25 mM       0.927
Na2EDTA                     1 mM        0.0372               0.25 mM     0.0566
29:1 acryl/bisacryl, 12%                11.59/0.41                       69.52/2.48
29:1 acryl/bisacryl, 15%                14.48/0.52                       86.89/3.103
29:1 acryl/bisacryl, 19%                18.344/0.6552                    110.07/3.93

Buffer components (in 2 L)  conc.       mass (g)             conc.       mass (g)
tris free base              100 mM      24.228               25 mM       6.06
boric acid                  100 mM      12.366               25 mM       3.09
Na2EDTA                     1 mM        0.744                0.25 mM     0.186
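The masses in these recipes follow from molarity, volume, and formula weight, and the acrylamide/bisacrylamide masses from the gel percentage and the 29:1 ratio. The sketch below is illustrative only; the formula weights are standard values assumed here (Na2EDTA taken as the dihydrate), and the helper names grams and acrylamide_split are not from the original protocol.

```python
# Formula weights in g/mol (assumed standard values; Na2EDTA as the dihydrate).
MW = {"urea": 60.06, "tris": 121.14, "boric acid": 61.83, "Na2EDTA": 372.24}

def grams(molar, volume_l, compound):
    """Mass (g) of compound needed for the given molarity and volume."""
    return molar * volume_l * MW[compound]

def acrylamide_split(percent, volume_ml, ratio=29):
    """Return (acrylamide g, bisacrylamide g) for a % (w/v) gel at ratio:1."""
    total = percent / 100 * volume_ml
    bis = total / (ratio + 1)
    return total - bis, bis

# Thin gel, 100 mL
print(f"urea: {grams(7, 0.1, 'urea'):.3f} g")
print(f"tris: {grams(0.1, 0.1, 'tris'):.4f} g")
acryl, bis = acrylamide_split(12, 100)
print(f"12% gel: {acryl:.2f} g acrylamide / {bis:.2f} g bisacrylamide")
```

Small differences from the tabulated values (for example 11.60/0.40 g versus 11.59/0.41 g for the 12% gel) reflect rounding in the original recipe.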
When purifying oligos in the range of 45 nucleotides in length, it is necessary to use a
“thick gel.” After all the gel components had dissolved, the volume was adjusted to that indicated and the solution was vacuum filtered through a 0.45 µm filter. After assembly of the 38 x 50 cm gel caster (BioRad), the gel was polymerized by the addition of 1 µL each of 25% ammonium persulfate and TEMED per mL of gel solution and quickly injected into the caster. After polymerization, the comb was removed and the apparatus was warmed over the course of ~1 hr to ~50 ºC by running the gel at 130 V
(PowerPac 3000 equipped with temperature probe, BioRad). Once equilibrated at 45-50
ºC, the wells were flushed free of urea by syringe prior to loading the oligos as a 50% formamide solution up to 2 nmol/lane (15 nmol/lane for thick gels). After resolution by
PAGE, the oligos were visualized against the fluorescent backdrop of a TLC plate in a dark room using a 354 nm UV lamp (UV shadowing). The DNA was extracted from the excised band by incubation in 10 mL of Gel Extraction Buffer overnight. De-salting was performed by HPLC and the DNA quantitated as described.
5.5 Recombinant DNA techniques
5.5.1 Restriction Digestions
All restriction digestions were performed in buffers obtained from NEB.
Complete digestions were usually carried out at 37 °C for 1-8 hr using 1.5 units of restriction enzyme per 1 µg of plasmid DNA. More enzyme (0.4-1 unit/pmol DNA)
was used for smaller DNA fragments (synthetic DNAs, PCR
products, or restriction fragments). If required, the restriction enzymes were removed
after digestion by Qiagen spin column kits according to the manufacturer’s protocol.
5.5.2 Filling Recessed 3’-Termini and Removing Protruding 3’-Termini
In order to ligate DNA fragments with incompatible termini, they were converted
into compatible forms by partially or completely filling-in the recessed 3’-termini or
removing the protruding 3’-termini. To fill in a recessed 3’-terminus, 1 µL of a solution containing the desired dNTPs (depending on the sequence of the 5’ overhang and whether partial or complete fill-in was required) at 1 mM was added directly to 0.2-5 µg of DNA
(20 µL in restriction enzyme buffer plus 5 mM MgCl2) digested with appropriate
restriction enzymes. After the addition of the Klenow fragment of DNA polymerase I (1
unit for every µg of DNA), the reaction mixture was incubated at room temperature for
15 min. Removal of protruding 3’ termini was carried out similarly except that T4 DNA
polymerase was used instead of the Klenow fragment, higher concentrations of dNTPs
were added (1 µL of 2 mM each), and the reaction was incubated at 14 ◦C for 15 min.
After the reaction, enzymes and dNTPs were removed by either agarose gel
electrophoresis or Qiagen Kit (Qiaquick mini column).
5.5.3 Removal of 5’ Phosphates
In order to prevent undesired self-ligation of a DNA fragment, the 5’ phosphate
groups were removed with calf intestinal alkaline phosphatase (CIP). Dephosphorylation
was carried out in a 50 µL reaction containing digested DNA (up to 10 µg) and CIP (0.01 unit/pmol of protruding 5’ termini, 1 unit/pmol of blunt or recessed 5’ termini) in 10 mM
Tris.HCl, pH 8.3, 1 mM MgCl2, 1 mM ZnCl2 buffer. The reaction mixture was diluted to
100 µL in TE buffer and the CIP was removed by extraction with a 1:1 mixture of phenol/chloroform (3x) and chloroform (2x), followed by ethanol precipitation.
Alternatively, the CIP was removed by Qiaquick spin-column.
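Because the CIP dose is specified per pmol of termini, the mass of DNA must first be converted to pmol of ends. The sketch below assumes an average mass of 660 g/mol per base pair and a linear fragment with two 5’ termini; these assumptions and the function names are illustrative and not part of the original protocol.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; a commonly used average (assumption)

def pmol_ends(micrograms, length_bp):
    """pmol of 5' termini in a linear dsDNA fragment (two ends per molecule)."""
    pmol_dna = micrograms * 1e6 / (AVG_BP_MASS * length_bp)
    return 2 * pmol_dna

def cip_units(micrograms, length_bp, protruding_5prime=True):
    """CIP units required, using the per-pmol doses given in the text."""
    rate = 0.01 if protruding_5prime else 1.0  # units per pmol of termini
    return rate * pmol_ends(micrograms, length_bp)

# Hypothetical example: 10 µg of a linearized 6 kbp vector.
print(f"{pmol_ends(10, 6000):.2f} pmol ends")
print(f"{cip_units(10, 6000):.3f} U (protruding 5' termini)")
print(f"{cip_units(10, 6000, protruding_5prime=False):.2f} U (blunt or recessed)")
```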
5.5.4 Ligation of DNA
Ligation reactions were generally carried out in 1X T4 DNA ligase buffer supplied with T4 DNA ligase at ~16 ºC for 4-20 hr. Ligations involving blunt-ends
were generally performed at room temperature. The concentrations of DNA varied from
2 µg/mL (sticky-end ligation) to 30 µg/mL (blunt-end ligation). The ratio of insert
DNA/vector DNA varied from 3:1 (restriction fragment to dephosphorylated vector
DNA) to 10:1. The concentrations of T4 DNA ligase varied from 5 (sticky-end ligation)
to 100 (blunt-end ligation) Weiss units/mL. The reaction volume was 20-40 µL. The
resulting solution was stored at -20 ◦C until used for transformation.
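The insert:vector ratios above are molar ratios, so the mass of insert scales with fragment length. A minimal sketch of that conversion follows; the function name and the example sizes are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=3.0):
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio

# Hypothetical example: 100 ng of a 6 kbp vector and a 1 kbp insert.
for ratio in (3, 10):
    print(f"{ratio}:1 -> {insert_mass_ng(100, 6000, 1000, ratio):.0f} ng insert")
```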
5.5.5 Transformation
Competent cells in competence solution B were prepared as described previously
and dispensed into 100 µL aliquots on ice. For transformations with purified supercoiled
plasmid DNA, three dilutions of plasmid DNA (approx. 100 ng, 10 ng, and 1 ng of
plasmid in 1-10 µL of TE buffer) were added each to an aliquot of competent cells in a
sterile microcentrifuge tube. For transformations with plasmid DNA from mutagenesis
synthesis reactions or ligation reactions, 10 µL, 1 µL, and 1 µL of a 10-fold dilution of the
reaction mixture (usually containing 1-10 µg/mL DNA) were added to aliquots of competent cells. The DNA/cell suspensions were gently mixed and kept on ice for 30-35
min, after which the tubes were placed in a 37-42 ◦C heat-block for 3 min. The tubes were
centrifuged for 15 sec in a microcentrifuge and the supernatants were carefully
withdrawn with a pipette. The cell pellet was gently resuspended in 1 mL of LB medium
and incubated at 37 ◦C for 30-60 min. The cells were pelleted again by centrifugation for
15 sec and resuspended in 100 µL of LB medium. This culture was evenly spread onto
an LB plate impregnated with the appropriate antibiotic(s). The plates were inverted and
incubated at 37 ◦C for 10-16 hr.
5.5.6 Small-Scale Preparation of Plasmid DNAs
Typically this was performed by isolating a single E. coli colony from the
transformation LB plate and inoculating 5 mL of LB media with the appropriate
antibiotic(s). This method was adapted from that of Holmes and Quigley [151]. After incubation with aeration at 37 °C for 16-24 hr, cells were pelleted by centrifugation
(5000 rpm, Sorvall SS-34 rotor, 5 min). The liquid supernatant was removed and the cell pellet was resuspended by vigorous vortexing in 100 µL of ice cold Solution I (50 mM glucose, 25 mM Tris.HCl, pH 8.0, 10 mM EDTA). 200 µL of freshly prepared Solution
II (0.2 N NaOH, 1% SDS) was added to the suspension and mixed by inverting 5-6 times and stored on ice for 2-5 min. 150 µL of ice-cold Solution III (60 mL 5 M KOAc, 11.5 mL glacial acetic acid, 28.5 mL ddH2O) was then added and mixed by vortexing in an
inverted position for 10 sec and returned to ice for 3-5 min. The lysed cell suspension
was centrifuged for 5 min in a microcentrifuge at 13,000 rpm. The supernatant was
transferred into a fresh tube and washed with 400 µL of 1:1 phenol:chloroform mixture.
The solution was centrifuged at 13,000 rpm for 2 min. The top aqueous layer was transferred to another fresh tube. The DNA was precipitated by adding 2 volumes of
ethanol and centrifuged at 4 ◦C for 5 min. The pelleted DNA was dried and redissolved in
50 µL TE and stored at -20 ◦C.
5.5.7 Mutagenesis
Site-directed mutagenesis was carried out with the QuikChange Site-Directed
Mutagenesis Kit (Stratagene). Reactions were carried out strictly as directed by the
manufacturer’s protocol. Mutagenesis primers (35-40 nt) were synthesized by
Integrated DNA Technologies (IDT). Pfu DNA polymerase was from Stratagene. Reactions
typically were carried out in a 50 µL volume in thin-wall PCR tubes by adding 5-50 ng
vector dsDNA template, 125 ng primers, 0.5 mM dNTP mix, and 2.5 units Pfu Turbo DNA
polymerase, and adjusting the volume to 50 µL with ddH2O. Cycling was
performed as follows: 16-18 cycles of 94 °C for 30 s/55 °C for 1 min/72 °C for 2 min per
1 kbp of plasmid. Plasmids typically contained about 6 kbp and required 12 min at the 72
°C step. At the completion of cycling, 10 units of Dpn I was added to the mutagenesis mixture and incubated at 37 °C for 1 hr. Dpn I treatment was required to remove the
parental or wild-type plasmid. The reaction mixture was then transformed into E. coli as
described previously.
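The extension-time rule used above (2 min at 72 °C per 1 kbp of plasmid) can be sketched as a small calculation. The function names are hypothetical, and the total omits any initial denaturation, final extension, and instrument ramp times, none of which are specified in the text.

```python
def extension_minutes(plasmid_kbp: float, min_per_kbp: float = 2.0) -> float:
    """Time at the 72 °C step, using the 2 min per kbp rule from the text."""
    if plasmid_kbp <= 0:
        raise ValueError("plasmid size must be positive")
    return plasmid_kbp * min_per_kbp


def cycling_minutes(plasmid_kbp: float, cycles: int,
                    denature_min: float = 0.5, anneal_min: float = 1.0) -> float:
    """Hold time summed over all cycles (94 °C 30 s / 55 °C 1 min / 72 °C step).

    Ramp times and any steps outside the cycle are not included.
    """
    per_cycle = denature_min + anneal_min + extension_minutes(plasmid_kbp)
    return cycles * per_cycle


# A typical ~6 kbp plasmid, as described in the text.
print(extension_minutes(6))      # minutes at 72 °C per cycle
print(cycling_minutes(6, 16))    # lower bound of the 16-18 cycle range
print(cycling_minutes(6, 18))    # upper bound of the 16-18 cycle range
```

For a 6 kbp plasmid this reproduces the 12 min extension quoted above and gives roughly 3.6 to 4 h of hold time for 16-18 cycles.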
5.5.8 Sequencing
DNA sequencing was performed by the dideoxy chain-termination method
derived from Sanger [152]. Sequencing was performed at the Plant Genomic Facility at
The Ohio State University.
APPENDIX: SUPPLEMENTARY SCHEMES, TABLES, AND FIGURES
[Scheme A1 graphic not reproduced. Recoverable labels: starting vectors pET-28a (kan; 6x His) and pMAL-c2 (amp; malE); pET-28a treated with Nde I/Xho I; pMAL-c2 treated by 1. PCR, 2. Nde I/Xho I; T4 DNA Ligase; product pETMAL (kan; 6x His, malE). Restriction sites shown: Nde I, Eco RI, Bam HI, Hind III, Not I, Xho I.]
Scheme A1. Construction of the pETMAL vector.
[Scheme A2 graphic not reproduced. Recoverable labels: starting vectors pET-14b (amp; 6x His) and PinPoint Xa-1 (amp; BCD); step lists shown: (1. Eco RI, 2. Klenow, 3. Eco RV, 4. T4 Ligase) and (1. PCR, 2. Eco RI, 3. Klenow, 4. Nde I); intermediate pET-14∆RIV (amp; 6x His); further steps (1. Bam HI, 2. Klenow, 3. Nde I) and T4 Ligase; product pET-PNPT (amp; 6x His, BCD). Restriction sites shown: Hind III, Bam HI, Eco RI, Nde I, Eco RV, Xho I, Bgl II, Not I.]
Scheme A2. Construction of the pET-PNPT vector.
[Scheme A3 graphic not reproduced. Recoverable labels: starting vectors pET-PNPT (amp; 6x His, BCD, T7 term) and pETMAL (kan; 6x His, malE); QuikChange SDM on pET-PNPT; Linker/MCS fragment prepared by 1. PCR, 2. Eco RV/Xho I, 3. Klenow (blunt ends; Linker = poly Asn); vector treated by 1. Xho I/Not I, 2. Klenow; T4 Ligase; product pPPTmal (amp; 6x His, BCD, Linker). Restriction sites shown: Hind III, Bam HI, Eco RI, Xho I, Eco RV, Bgl II, Not I.]
Scheme A3. Construction of the pPPTmal vector.
[Scheme A4 graphic not reproduced. Recoverable labels: starting vector pET29-GFPuv (kan; 6x His, GFP); GFP amplified by 1. PCR, 2. Bgl II/Eco RI (no Bam HI or stop); vector cut with Bam HI/Eco RI; T4 DNA Ligase; SDM: 1. -Nde I, 2. S72A, 3. I167T, 4. F64L, S65T, V68L, 5. +Eco RV; Eco RV/Hind III; Linker/MCS (blunt/Hind III); T4 Ligase; product pGFPmal (kan; 6x His, GFP, Linker). Note on scheme: similar to pPPTmal scheme.]
Scheme A4. Construction of the pGFPmal vector.
Figure A1. Coomassie-stained 4-15% SDS-PAGE gel (BioRad) of purified SHP-2.
The indicated molecular weights are listed in units of kDa. Lane X is the molecular weight marker (BioRad), lane 1 contains ~2.5 µg, and lane 3 contains ~6 µg of total protein.
MApYAVI TRpYSII MKpYASI YEpYSMI VRpYSFV IEpYAII TYpYSII LLpYAEI YHpYSVI VIpYSRV ILpYAII TVpYSNI LNpYALI HYpYSMI TMpYSAV IKpYAKI TFpYSSI LNpYALI QMpYSTI TRpYSIV IIpYAQI TNpYSVI LSpYALI LVpYTEI TYpYSLV IQpYASI AYpYSTI IApYALI TMpYTDI TFpYSPV IRpYASI VYpYTAI IGpYARI TMpYTGI TMpYSSV IVpYASI TIpYTMI VNpYADI TNpYTHI TNpYSYV IYpYASI TApYTQI VRpYADI TQpYTYI SYpYSTV IEpYAVI TYpYTQI VKpYAFI SVpYTTI YSpYSEV INpYAYI TFpYTRI VMpYAMI YIpYTLI VIpYTHV VQpYAFI TIpYTSI VNpYAVI YEpYTQI TFpYTDV VTpYAII TTpYTSI TSpYAAI YYpYTTI YTpYTRV VApYALI TSpYTTI TFpYAFI HFpYTQI YNpYTYV VEpYALI YFpYTTI TApYAII MDpYVQI TEpYVTV VFpYANI HFpYTTI TFpYAKI LHpYVEI YEpYVFV VGpYAQI HYpYTTI TEpYAVI LHpYVSI YMpYTPT VMpYAQI VSpYVQI TKpYAYI ITpYVSI VQpYAAL VApYAVI THpYVII SMpYAQI TDpYVII VSpYARL VHpYAVI TNpYVQI AEpYAEI THpYVQI YYpYAHL VLpYAVI TDpYVVI AMpYAFI PWpYVTI YSpYAIL TTpYANI TEpYVVI AQpYAVI AQpYVHI NRpYAIL TSpYAQI LFpYATV GLpYAHI ATpYVVI VLpYSQL TVpYATI IApYAIV WHpYAEI YNpYVLI IHpYTTL TKpYAVI VSpYATV WNpYAYI YEpYVMI TIpYTML SGpYAII TMpYAFV YApYADI NQpYVQI TIpYTQL SVpYANI IIpYSQV FHpYAQI DApYVSI TMpYTTL SYpYARI IQpYSTV FLpYAVI EApYVAI TRpYTVL SNpYAVI IMpYSVV HSpYAQI RApYVVI TLpYVTL SWpYAVI VTpYSRV QPpYATI YNpYLTI YTpYVYL PFpYAII TFpYSVV NVpYAAI LLpYATV VTpYATM PYpYAII VVpYTAV NYpYADI VIpYADV ITpYTMM AMpYAII TVpYTQV NMpYAFI VLpYAHV VMpYTAM GHpYAII IKpYAIL DKpYAII VYpYAIV TQpYVYF GQpYATI VTpYAQL DGpYAKI VIpYALV TKpYIYF WNpYAII VIpYATL MVpYSQI THpYAQV IVpYMTF YFpYAHI TKpYALL IYpYSPI TRpYAVV IHpYMYF YQpYAII YSpYARL IEpYSTI TRpYAVV SYpYVYY YQpYAKI TApYSIL IApYSYI SMpYAVV YNpYVFY YLpYATI TMpYTTL IEpYSYI AMpYAIV YApYVYY YEpYAVI ITpYTTM TYpYSII YRpYAQV IHpYLTY YHpYAVI VMpYTQM TYpYSQI NYpYATV YRpYMTY YLpYAVI TYpYAWM TPpYSVI IVpYSEV YDpYMYY VIpYSMI TVpYLTY TYpYSVI IFpYSQV IVpYITA VFpYSQI MEpYAEI SFpYSMI IIpYSTV PYpYIFA VVpYSTI MMpYAEI ANpYSTI VIpYSDV TVpYLYA YApYSEI
Table A1. Additional sequences selected against 50 nM SHP-2 C-SH2 domain. Boldface indicates sequences derived from intensely colored beads. Normal typeface sequences are from medium-colored beads.
Class I Class II Class V IHpYVEI TLpYFTL WMpYYIQ WMpYRTV ISpYALF IVpYALI LQpYIIL WMpYYIR WTpYSTV RMpYKLF IYpYATI LHpYLVL WMpYTLE WMpYQYV LSpYLVF IIpYAIL ISpYMVL WVpYYTT WTpYVIY LTpYMSF AYpYAVI IRpYTIL WTpYTLY WMpYYQY LQpYMVF IIpYAAI LHpYTEL WTpYQIM WNpYMVY TApYMVF IYpYADI VVpYTIL WSpYKIY WMpYNYY VNpYMYF IQpYADI IWpYVAL WMpYRGA YVpYYIA MHpYVVF IVpYAII LNpYVEL WIpYQIA YIpYYID YWpYLTF IFpYATI VNpYVIL WVpYTIA RMpYYYH IRpYAYI LYpYAEM WMpYSIA Class III ILpYHIY LRpYAHI LRpYASM WVpYYLA IMpYTYA ITpYITY LLpYAII TLpYLVM WSpYQLA VMpYLYS IVpYLTY LNpYAKI ITpYMAM WMpYNLA LMpYMTS LSpYMYY LNpYAMI ITpYRIM WMpYYID MVpYMYS ITpYTYY LKpYATI ITpYSYM WMpYYID TMpYMYS LHpYTTY MVpYAEI ITpYILN WMpYHMD PMpYLYT LKpYVYY MRpYAEI IMpYQIN WTpYQIE QMpYMYT LTpYYYY VFpYAQI ITpYVIN WMpYKTE IHpYLYP YTpYLIY VYpYAQI ITpYANT WSpYTMF FMpYTYP WMpYYGY VRpYAVI ITpYITT WMpYFIG ILpYFQI LHpYMET WVpYTIH Class IV Other IQpYIAI VIpYMST WMpYRLI ILpYFFP LTpYRIV ITpYIDI IWpYATV WSpYYLI LYpYFSP MVpYIVA INpYINI ISpYAVV WTpYTTI VLpYFAP MNpYIYA LHpYLVI TLpYAIV WMpYRTI VVpYFIP VWpYIVA VQpYLII VIpYAEV WTpYTTI LYpYMPP ITpYYIE LHpYSTI IKpYISV WTpYSVK LRpYMVP LVpYTAI LApYIQV WSpYELL MMpYMTP LTpYTLI LRpYLQV WMpYTSL QMpYMIP LVpYTQI IEpYTAV WTpYHIM QWpYIVP VQpYTEI IQpYTHV WSpYENN LMpYLSP VKpYTEI IFpYTLV WTpYSTR AVpYFIA VMpYTQI ITpYTLV WTpYSYR VWpYMVA VFpYTQI VVpYTEV WIpYTMS VMpYTSI VKpYTPV WMpYNTS VVpYTVI LRpYVFV WTpYTIT LHpYVLI VNpYVEV WVpYTIT LQpYVQI VVpYVQV WMpYQIT TRpYVEI VTpYVQV WMpYFTT VNpYVEI YQpYAII WSpYYTT IMpYYGI YPpYAMI WVpYYTT FPpYAVL YLpYTQI WVpYRYT LMpYANL YApYAQI WSpYTIV YFpYTSI WTpYSLV
Table A2. Additional sequences selected against 50 nM SHP-2 N-SH2 domain.
AVpYSLL QLpYTYM MGpYYFM DGpYSLL VVpYTYI QApYYFL YFpYSLV ARpYVLL FRpYYFI SYpYSII YApYVML GGpYYFI FQpYSVM MGpYVYM DGpYYFV AMpYSVM KGpYVYM MYpYYYM PYpYSFA PFpYYLL MQpYYYM WYpYSFI QGpYYLL QApYYYI PTpYSFL WApYYLV ATpYYYI LNpYTLL GTpYYLV YLpYYYI VLpYTLL YFpYYLA PMpYYYI SSpYTLL LTpYYMM QVpYYYV ANpYTLM PQpYYMI FYpYFYA NLpYTLM PFpYYMI YTpYFYV PFpYTML KGpYYVM TVpYALM
Table A3. Additional sequences from lightly colored beads selected against 10 nM SHIP
SH2 domain.
AHPV IGNW KTRI REPR HHWQ MSSV RITV PYTW
Table A4. Peptide sequences from beads selected by BIR3 domain fluorimetric screening.
[Histogram graphic not reproduced. Five panels, one per position relative to pY (-2, -1, +1, +2, +3); y-axis: Occurrence (0-40); x-axis: amino acids in the order D E N Q H K R W F Y M L I V T S A G P.]
Figure A2. Composite histogram of sequences selected by SHP-2 N-SH2 domain.
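A positional histogram of this kind can be tallied from the selected sequences with a few lines of code. The sketch below is illustrative only and is not the procedure used to prepare the figure: the function names are hypothetical, the input is assumed to follow the XXpYXXX notation of Tables A1-A3, and the four example sequences are taken from Table A2.

```python
from collections import Counter

# x-axis residue order as it appears in Figure A2.
AMINO_ACID_ORDER = "DENQHKRWFYMLIVTSAGP"
# Positions relative to pY: two N-terminal, three C-terminal.
POSITIONS = (-2, -1, 1, 2, 3)


def split_around_py(sequence: str) -> dict:
    """Map each position (-2..+3) to its residue for a sequence written as XXpYXXX."""
    n_term, sep, c_term = sequence.partition("pY")
    if not sep or len(n_term) != 2 or len(c_term) != 3:
        raise ValueError(f"expected XXpYXXX format, got {sequence!r}")
    return dict(zip(POSITIONS, n_term + c_term))


def tally_positions(sequences) -> dict:
    """Count how often each residue occurs at each position relative to pY."""
    counts = {pos: Counter() for pos in POSITIONS}
    for seq in sequences:
        for pos, residue in split_around_py(seq).items():
            counts[pos][residue] += 1
    return counts


# Four Class I sequences from Table A2, used only as a small example.
example = ["IHpYVEI", "IVpYALI", "IYpYATI", "IIpYAIL"]
counts = tally_positions(example)
for pos in POSITIONS:
    row = " ".join(f"{aa}:{counts[pos][aa]}" for aa in AMINO_ACID_ORDER if counts[pos][aa])
    print(f"{pos:+d}  {row}")
```

Applying `tally_positions` to the complete sequence list for a domain would give the per-position occurrence counts that a histogram such as Figure A2 displays.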
[Histogram graphic not reproduced. Two columns of five panels each, (A) and (B), one panel per position relative to pY (-2, -1, +1, +2, +3); y-axis: Occurrence (0-20); x-axis: amino acids in the order D E N Q H K R W F Y M L I V T S A G P.]
Figure A3. Additional histograms of lesser represented ligands selected by the N-SH2 domain of SHP-2. (A) Class III, and (B) Class IV ligands.
BIBLIOGRAPHY
1. Merrifield, R.B. (1963). Solid phase peptide synthesis. I. The synthesis of a
tetrapeptide. J. Am. Chem. Soc. 85, 2149-2154.
2. Letsinger, R.L., and Mahadevan, V. (1965). Oligonucleotide synthesis on a
polymer support. J. Am. Chem. Soc. 87, 3526-3527.
3. Frechet, J.M., and Schuerch, C. (1971). Solid-phase synthesis of oligosaccharides.
I. Preparation of the solid support. J. Am. Chem. Soc. 93, 492-496.
4. Lam, K.S., et al. (1991). A new type of synthetic peptide library for identifying
ligand-binding activity. Nature 354, 82-84.
5. Houghten, R.A., et al. (1991). Generation and use of synthetic peptide
combinatorial libraries for basic research and drug discovery. Nature 354, 84-86.
6. Furka, A., Sebestyen, F., Asgedom, M., and Dibo, G. (1991). General method for
rapid synthesis of multicomponent peptide mixtures. Int. J. Peptide Protein Res.
37, 487-493.
7. Merrifield, R.B. (1964). Solid phase peptide synthesis. II. The synthesis of
Bradykinin. J. Am. Chem. Soc. 86, 304-305.
8. Merrifield, R.B. (1964). Solid-phase peptide synthesis. III. An improved synthesis
of Bradykinin. Biochemistry 3, 1385-1389.
9. Lenard, J., and Robinson, A.B. (1967). Use of hydrogen fluoride in Merrifield
solid-phase peptide synthesis. J. Am. Chem. Soc. 89, 181-182.
10. Carpino, L.A., and Han, G.Y. (1972). 9-Fluorenylmethoxycarbonyl amine-
protecting group. J. Org. Chem. 37, 3404-3409.
11. Carpino, L.A., et al. (2002). The uronium/guanidinium peptide coupling reagents:
finally the true uronium salts. Angew. Chem. Int. Ed. 41, 441-445.
12. Albericio, F. (2004). Developments in peptide and amide synthesis. Curr. Opin.
Chem. Biol. 8, 211-221.
13. betterresin1.
14. betterresin2.
15. Scott, J.K., and Smith, G.P. (1990). Searching for peptide ligands with an epitope
library. Science 249, 386-390.
16. Geysen, H.M., Meloen, R.H., and Barteling, S.J. (1984). Use of peptide synthesis
to probe viral antigens for epitopes to a resolution of a single amino acid. Proc.
Natl. Acad. Sci. USA 81, 3998-4002.
17. Mattheakis, L.C., Bhatt, R.R., and Dower, W.J. (1994). An in vitro polysome
display system for identifying ligands from very large peptide libraries. Proc.
Natl. Acad. Sci. USA 91, 9022-9026.
18. Roberts, R.W., and Szostak, J.W. (1997). RNA-peptide fusions for the in vitro
selection of peptides and proteins. Proc. Natl. Acad. Sci. USA 94, 12297-12302.
19. Nemoto, N., Miyamoto-Sato, E., Husimi, Y., and Yanagawa, H. (1997). In vitro
virus: Bonding of mRNA bearing puromycin at the 3’-terminal end to the C-
terminal end of its encoded protein on the ribosome in vitro. FEBS Lett. 414, 405-
408.
20. Merryman, C., Weinstein, E., Wnuk, S.F., and Bartel, D.P. (2002). A bifunctional
tRNA for in vitro selection. Chem. Biol. 9, 741-746.
21. Liu, R., Barrick, J.E., Szostak, J.W., and Roberts, R.W. (2000). Optimized
synthesis of RNA-protein fusions for in vitro protein selection. Methods Enzymol.
318, 268-293.
22. Liu, R., Marik, J., and Lam, K.S. (2003). Design, synthesis, screening, and
decoding of encoded one-bead one-compound peptidomimetic and small
molecule libraries. Methods Enzymol. 369, 271-287.
23. Smith, G.P., and Petrenko, V.A. (1997). Phage Display. Chem. Rev. 97, 391-410.
24. Kurz, M., Kuang, G., and Lohse, P.A. (2000). An efficient synthetic strategy for
the preparation of nucleic acid-peptide and protein libraries for in vitro evolution
protocols. Molecules 5, 1259-1264.
25. Matthews, D.J., and Wells, J.A. (1993). Substrate phage: selection of protease
substrates by monovalent phage display. Science 260, 1113-1117.
26. Nixon, A.E. (2002). Phage display as a tool for protease ligand discovery. Curr.
Pharm. Biotechnol. 3, 1-12.
27. Zwick, M.B., Shen, J., and Scott, J.K. (1998). Phage-displayed peptide libraries.
Curr. Opin. Biotechnol. 9, 427-436.
28. Bjorklund, M., and Koivunen, E. (2004). Steps towards phage display libraries
with an extended amino acid repertoire. Lett. Drug Design Dis. 1, 163-167.
29. Li, S., Millward, S., and Roberts, R. (2002). In vitro selection of mRNA display
libraries containing an unnatural amino acid. J. Am. Chem. Soc. 124, 9972-9973.
30. Noren, C.J., Anthony-Cahill, S.J., Griffith, M.C., and Schultz, P.G. (1989). A
general method for the site-specific incorporation of unnatural amino acids into
protein. Science 244, 182-188.
31. Bain, J.D., Diala, E.S., Glabe, C.G., Dix, T.A., and Chamberlin, A.R. (1989).
Biosynthetic site-specific incorporation of a non-natural amino acid into a
polypeptide. J. Am. Chem. Soc. 111, 8013-8014.
32. Frankel, A., Li, S., Starck, S.R., and Roberts, R.W. (2003). Unnatural RNA
display libraries. Curr. Opin. Struct. Biol. 13, 506-512.
33. Tian, F., Tsao, M.-L., and Schultz, P.G. (2004). A phage display system with
unnatural amino acids. J. Am. Chem. Soc. 126, 15962-15963.
34. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997). Identification
of prokaryotic and eukaryotic signal peptides and prediction of their cleavage
sites. Protein Eng. 10, 1-6.
35. Franklin, M.C., et al. (2003). Structure and function analysis of peptide
antagonists of melanoma inhibitor of apoptosis (ML-IAP). Biochemistry 42,
8223-8231.
36. Peters, E.A., Schatz, P.J., Johnson, S.S., and Dower, W.J. (1994). Membrane
insertion defects caused by positive charges in the early mature region of protein
pIII of filamentous phage fd can be corrected by prlA suppressors. J. Bacteriol.
176, 4296-4305.
37. Brenner, S., and Lerner, R.A. (1992). Encoded combinatorial chemistry. Proc.
Natl. Acad. Sci. USA 89, 5381-5383.
38. Needels, M.C., et al. (1993). Generation and screening of an oligonucleotide-
encoded synthetic peptide library. Proc. Natl. Acad. Sci. USA 90, 10700-10704.
39. Seneci, P. (2001). Direct deconvolution techniques for pool libraries of small
organic molecules. J. Rec. Sig. Trans. Res. 21, 377-408.
40. Seneci, P. (2001). Encoding techniques for pool libraries of small organic
molecules. J. Rec. Sig. Trans. Res. 21, 409-445.
41. Mitsopoulos, G., Walsh, D.P., and Chang, Y.T. (2004). Tagged library approach
to chemical genomics and proteomics. Curr. Opin. Chem. Biol. 8, 26-32.
42. Wang, P., Fu, H., Snavely, D.F., Freitas, M.A., and Pei, D. (2002). Screening
combinatorial libraries by mass spectrometry. 2. Identification of optimal
substrates of protein tyrosine phosphatase SHP-1. Biochemistry 41, 6202-6210.
43. Sadowski, I., Stone, J.C., and Pawson, T. (1986). A noncatalytic domain
conserved among cytoplasmic protein-tyrosine kinases modifies the kinase
function and transforming activity of Fujinami sarcoma virus p130gag-fps. Mol.
Cell Biol. 6, 4396-4408.
44. DeClue, J.E., Sadowski, I., Martin, G.S., and Pawson, T. (1987). A conserved
domain regulates interactions of the v-fps protein-tyrosine kinase with the host
cell. Proc. Natl. Acad. Sci. USA 84, 9064-9068.
45. Moran, M.F., Koch, C.A., Anderson, D., Ellis, C., England, L., Martin, G.S., and
Pawson, T. (1990). Src homology region 2 domains direct protein-protein
interactions in signal transduction. Proc. Natl. Acad. Sci. USA 87, 8622-8626.
46. Pawson, T., Raina, M., and Nash, P. (2002). Interaction domains: from simple
binding events to complex cellular behavior. FEBS Lett. 513, 2-10.
47. Songyang, Z., et al. (1993). SH2 domains recognize specific phosphopeptide
sequences. Cell 72, 767-778.
48. Edman, P. (1950). Method for determination of the amino acid sequence in
peptides. Acta Chem. Scand. 4, 283-293.
49. James, P. (1997). Protein identification in the post-genome era: the rapid rise of
proteomics. Quart. Rev. Biophys. 30, 279-331.
50. Chait, B.T., Wang, R., Beavis, R.C., and Kent, S.B.H. (1993). Protein ladder
sequencing. Science 262, 89-92.
51. Youngquist, R.S., Fuentes, G.R., Lacey, M.P., and Keough, T. (1995). Generation
and screening of combinatorial peptide libraries designed for rapid sequencing by
mass spectrometry. J. Am. Chem. Soc. 117, 3900-3906.
52. Wang, P., Arabaci, G., and Pei, D. (2001). Rapid sequencing of library-derived
peptides by partial Edman degradation and mass spectrometry. J. Comb. Chem. 3,
251-254.
53. Tadjamulia, M.L., Srivastava, P.C., and Knapp, F.F. (1985). Evaluation of the
brain-specific delivery of radioiodinated (iodophenyl)alkyl-substituted amines
coupled to a dihydropyridine carrier. J. Med. Chem. 28, 1574-1580.
54. Rajagopalan, P.T.R., Grimme, S., and Pei, D. (2000). Characterization of
cobalt(II)-substituted peptide deformylase: function of the metal ion and the
catalytic residue Glu-133. Biochemistry 39, 779-790.
55. Pham, V., Tropea, J., Wong, S., Quach, J., and Henzel, W.J. (2003). High-
throughput protein sequencing. Anal. Chem. 75, 875-882.
56. Mahoney, W.C. (1985). An amino-terminal tryptophan derivative which is
refractory to Edman degradation. Anal. Biochem. 147, 331-335.
57. Bienvenut, W.V., et al. (2002). Matrix-assisted laser desorption/ionization-
tandem mass spectrometry with high resolution and sensitivity for identification
and characterization of proteins. Proteomics 2, 868-876.
58. Pawson, T., and Nash, P. (2003). Assembly of cell regulatory systems through
protein interaction domains. Science 300, 445-452.
59. Bork, P., Schultz, J., and Ponting, C.P. (1997). Cytoplasmic signaling domains:
the next generation. Trends. Biochem. Sci. 22, 296-298.
60. De Souza, D., et al. (2002). SH2 domains from suppressor of cytokine signaling-3
and protein tyrosine phosphatase SHP-2 have similar binding specificities.
Biochemistry 41, 9229-9236.
61. Muller, K., et al. (1996). Rapid identification of phosphopeptide ligands for SH2
domains: screening of peptide libraries by fluorescence-activated bead sorting. J.
Biol. Chem. 271, 16500-16505.
62. Rickles, R.J., et al. (1994). Identification of Src, Fyn, Lyn, PI3K, and Abl SH3
domain ligands using phage display libraries. EMBO J. 13, 5598-5604.
63. Gram, H., Schmitz, R., Zuber, J.F., and Baumann, G. (1997). Identification of
phosphopeptide ligands for the Src-homology 2 (SH2) domain of Grb2 by phage
display. Eur. J. Biochem. 246, 633-637.
64. King, T.R., Fang, Y., Mahon, E.S., and Anderson, D.H. (2000). Using a phage
display library to identify basic residues in A-Raf required to mediate binding to
the Src homology 2 domains of the p85 subunit of phosphatidylinositol 3’-kinase.
J. Biol. Chem. 275, 36450-36456.
65. Sibler, A.-P., Kempf, E., Glacet, A., Orfanoudakis, G., Bourel, D., and Weiss, E.
(1999). In vivo biotinylated recombinant antibodies: high efficiency of labelling
and application to the cloning of active anti-human IgG1 Fab fragments. J.
Immunol. Methods 224, 129-140.
66. Sugimoto, S., Lechleider, R.J., Shoelson, S.E., Neel, B.G., and Walsh, C.T.
(1993). Expression, purification, and characterization of SH2-containing protein
tyrosine phosphatase, SH-PTP2. J. Biol. Chem. 268, 22771-22776.
67. Freeman, R.M., Jr., Plutzky, J., and Neel, B.G. (1992). Identification of a human
src homology 2-containing protein-tyrosine phosphatase: putative homologue of
Drosophila Corkscrew. Proc. Natl. Acad. Sci. USA 89, 11239-11243.
68. Tridandapani, S., et al. (1997). Recruitment and phosphorylation of SH2-
containing inositol phosphatase and Shc to the B-cell Fcγ immunoreceptor
tyrosine-based inhibitory motif peptide motif. Mol. Cell Biol. 17, 4305-4311.
69. Pei, D., Lorenz, U., Klingmuller, U., Neel, B.G., and Walsh, C.T. (1994).
Intramolecular regulation of protein tyrosine phosphatase SH-PTP1: a new
function for Src Homology 2 domains. Biochemistry 33, 15483-15493.
70. Pei, D., Wang, J., and Walsh, C.T. (1996). Differential functions of the two Src
homology 2 domains in protein tyrosine phosphatase SH-PTP1. Proc. Natl. Acad.
Sci. USA 93, 1141-1145.
71. Sweeney, M.C., and Pei, D. (2003). An improved method for rapid sequencing of
support-bound peptides by partial Edman degradation and mass spectrometry. J.
Comb. Chem. 5, 218-222.
72. Neel, B.G., Gu, H., and Pao, L. (2003). The ‘Shp’ing news: SH2 domain-
containing tyrosine phosphatases in cell signaling. Trends. Biochem. Sci. 28, 284-
293.
73. Ravetch, J.V., and Lanier, L.L. (2000). Immune inhibitory receptors. Science 290,
84-89.
74. Waksman, G., et al. (1992). Crystal structure of the phosphotyrosine recognition
domain SH2 of v-src complexed with tyrosine-phosphorylated peptides. Nature
358, 646-653.
75. Eck, M.J., Atwell, S.K., Shoelson, S.E., and Harrison, S.C. (1994). Structure of
the Regulatory Domains of the Src-Family Tyrosine Kinase Lck. Nature 368,
764-769.
76. Lee, C.-H., et al. (1994). Crystal structures of peptide complexes of the amino-
terminal SH2 domain of the Syp tyrosine phosphatase. Structure 2, 423-438.
77. Burshtyn, D.N., Yang, W., Yi, T., and Long, E.O. (1997). A novel
phosphotyrosine motif with a critical amino acid at position -2 for the SH2
domain-mediated activation of tyrosine phosphatase SHP-1. J. Biol. Chem. 272,
13066-13072.
78. Beebe, K.D., Wang, P., Arabaci, G., and Pei, D. (2000). Determination of binding
specificity of the SH2 domains of protein tyrosine phosphatase SHP-1 through the
screening of a combinatorial phosphotyrosyl peptide library. Biochemistry 39,
13251-13260.
79. Liao, H., et al. (2000). Structure of the FHA1 domain of yeast Rad53 and
identification of binding sites for both FHA1 and its target protein Rad9. J. Mol.
Biol. 304, 941-951.
80. Bruhns, P., et al. (2000). Molecular basis of the recruitment of the SH2 domain-
containing inositol 5-phosphatases SHIP1 and SHIP2 by FcγRIIB. J. Biol. Chem.
275, 37357-37364.
81. Tangye, S.G., et al. (1999). Cutting edge: human 2B4, an activating NK cell
receptor, recruits the protein tyrosine phosphatase SHP-2 and the adaptor
signaling protein SAP. J. Immunol. 162, 6981-6985.
82. Angata, T., et al. (2002). Cloning and characterization of human Siglec-11. J.
Biol. Chem. 277, 24466-24474.
83. Huber, M., et al. (1999). The carboxyl-terminal region of biliary glycoprotein
controls its tyrosine phosphorylation and association with protein-tyrosine
phosphatases SHP-1 and SHP-2 in epithelial cells. J. Biol. Chem. 274, 335-344.
84. Gavrieli, M., Watanabe, N., Loftin, S.K., Murphy, T.L., and Murphy, K.M.
(2003). Characterization of phosphotyrosine binding motifs in the cytoplasmic
domain of B and T lymphocyte attenuator required for association with protein
tyrosine phosphatases SHP-1 and SHP-2. Biochem. Biophys. Res. Commun. 312,
1236-1243.
85. De Vet, E.C.J.M., Aguado, B., and Campbell, R.D. (2001). G6b, a novel
immunoglobulin superfamily member encoded in the human major
histocompatibility complex, interacts with SHP-1 and SHP-2. J. Biol. Chem. 276,
42070-42076.
86. Sui, L., et al. (2004). IgSF13, a novel human inhibitory receptor of the
immunoglobulin superfamily, is preferentially expressed in dendritic cells and
monocytes. Biochem. Biophys. Res. Commun. 319, 920-928.
87. Alvarez-Errico, D., et al. (2004). IREM-1 is a novel inhibitory receptor expressed
by myeloid cells. Eur. J. Immunol. 34, 3690-3701.
88. Fanger, N.A., et al. (1998). The MHC class I binding proteins LIR-1 and LIR-2
inhibit Fc receptor-mediated signaling in monocytes. Eur. J. Immunol. 28, 3423-
3434.
89. Cella, M., et al. (1997). A novel inhibitory receptor (ILT3) expressed on
monocytes, macrophages, and dendritic cells involved in antigen processing. J.
Exp. Med. 185, 1743-1751.
90. Cantoni, C., et al. (1999). Molecular and functional characterization of IRp60, a
member of the immunoglobulin superfamily that functions as an inhibitory
receptor in human NK cells. Eur. J. Immunol. 29, 3148-3159.
91. Burshtyn, D.N., et al. (1996). Recruitment of tyrosine phosphatase HCP by the
killer cell inhibitory receptor. Immunity 4, 77-85.
92. Olcese, L., et al. (1996). Human and mouse killer-cell inhibitory receptors recruit
PTP1C and PTP1D protein tyrosine phosphatases. J. Immunol. 156, 4531-4534.
93. Fry, A.M., Lanier, L.L., and Weiss, A. (1996). Phosphotyrosines in the killer cell
inhibitory receptor motif of NKB1 are required for negative signaling and for
association with protein tyrosine phosphatase 1C. J. Exp. Med. 184, 295-300.
94. Lewis, J., et al. (2001). Distinct interactions of the X-linked lymphoproliferative
syndrome product SAP with cytoplasmic domains of members of the CD2
receptor family. Clin. Immunol. 100, 15-23.
95. Xu, M.J., Zhao, R., and Zhao, Z.J. (2000). Identification and characterization of
leukocyte-associated Ig-like receptor-1 as a major anchor protein of tyrosine
phosphatase SHP-1 in hematopoietic cells. J. Biol. Chem. 275, 17440-17446.
96. Carretero, M., et al. (1998). Specific engagement of the CD94/NKG2-A killer
inhibitory receptor by the HLA-E class Ib molecule induces SHP-1 phosphatase
recruitment to tyrosine-phosphorylated NKG2-A: evidence for receptor function
in heterologous transfectants. Eur. J. Immunol. 28, 1280-1291.
97. Bottino, C., et al. (2001). NTB-A, a novel SH2D1A-associated surface molecule
contributing to the inability of natural killer cells to kill Epstein-Barr virus
infected B cells in X-linked lymphoproliferative disease. J. Exp. Med. 194, 235-
246.
98. Mousseau, D.D., Banville, D., L'Abbe, D., Bouchard, P., and Shen, S.H. (2000).
PILRα, a novel immunoreceptor tyrosine-based inhibitory motif-bearing protein,
recruits SHP-1 upon tyrosine phosphorylation and is paired with the truncated
counterpart PILRβ. J. Biol. Chem. 275, 4467-4474.
99. Fournier, N., et al. (2000). FDF03, a novel inhibitory receptor of the
immunoglobulin superfamily, is expressed by human dendritic and myeloid cells.
J. Immunol. 165, 1197-1209.
100. Wong, M.X., and Jackson, D.E. (2004). Regulation of B cell activation by
PECAM-1: Implications for the development of autoimmune disorders. Curr.
Pharm. Design 10, 155-161.
101. Zhao, Z.J., and Zhao, R. (1998). Purification and cloning of PZR, a binding
protein and putative physiological substrate of tyrosine phosphatase SHP-2. J.
Biol. Chem. 273, 29367-29372.
102. Zhao, R., and Zhao, Z.J. (2000). Dissecting the interaction of SHP-2 with PZR, an
immunoglobulin family protein containing immunoreceptor tyrosine-based
inhibitory motifs. J. Biol. Chem. 275, 5453-5459.
103. Xu, M.J., Zhao, R., and Zhao, Z.J. (2001). Molecular cloning and characterization
of SPAP1, an inhibitory receptor. Biochem. Biophys. Res. Commun. 280.
104. Doody, G.M., et al. (1995). A role in B cell activation for CD22 and the protein
tyrosine phosphatase SHP. Science 269, 242-244.
105. Law, C.-L., et al. (1996). CD22 associates with protein tyrosine phosphatase 1C,
Syk, and Phospholipase C-γ1 upon B cell activation. J. Exp. Med. 183, 547-560.
106. Taylor, V.C., et al. (1999). The myeloid-specific sialic acid-binding receptor,
CD33, associates with the protein-tyrosine phosphatases, SHP-1 and SHP-2. J.
Biol. Chem. 274, 11505-11512.
107. Ikehara, Y., Ikehara, S.K., and Paulson, J.C. (2004). Negative regulation of T Cell
receptor signaling by Siglec-7 (p70/AIRM) and Siglec-9. J. Biol. Chem. 279,
43117-43125.
108. Yu, Z., Lai, C.M., Maoui, M., Banville, D., and Shen, S.H. (2001). Identification
and characterization of S2V, a novel putative siglec that contains two V set Ig-like
domains and recruits protein-tyrosine phosphatase SHPs. J. Biol. Chem. 276,
23816-23824.
109. Fujioka, Y., et al. (1996). A novel membrane glycoprotein, SHPS-1, that binds
the SH2-domain-containing protein tyrosine phosphatase SHP-2 in response to
mitogens and cell adhesion. Mol. Cell Biol. 16, 6887-6899.
110. Veillette, A., Thibaudeau, E., and Latour, S. (1998). High expression of inhibitory
receptor SHPS-1 and its association with protein-tyrosine phosphatase SHP-1 in
macrophages. J. Biol. Chem. 273, 22719-22728.
111. Florio, T., et al. (2000). Somatostatin receptor 1 (SSTR-1)-mediated inhibition of
cell proliferation correlates with the activation of the MAP kinase cascade: role of
the phosphatase SHP-2. J. Physiol. 94, 239-250.
112. Okazaki, T., Maeda, A., Nishimura, H., Kurosaki, T., and Honjo, T. (2001). PD-1
immunoreceptor inhibits B cell receptor-mediated signaling by recruiting src
homology 2-domain-containing tyrosine phosphatase 2 to phosphotyrosine. Proc.
Natl. Acad. Sci. USA 98, 13866-13871.
113. Daigle, I., Yousefi, S., Colonna, M., Green, D.R., and Simon, H.-U. (2002). Death
receptors bind SHP-1 and block cytokine-induced anti-apoptotic signaling in
neutrophils. Nature Med. 8, 61-67.
114. Li, C., and Friedman, J.M. (1999). Leptin receptor activation of SH2 domain
containing protein tyrosine phosphatase 2 modulates Ob receptor signal
transduction. Proc. Natl. Acad. Sci. USA 96, 9677-9682.
115. Myers, M.G., Jr., et al. (1998). The COOH-terminal tyrosine phosphorylation
sites on IRS-1 bind SHP-2 and negatively regulate insulin signaling. J. Biol.
Chem. 273, 26908-26914.
116. Kitzig, F., Martinez-Barriocanal, A., Lopez-Botet, M., and Sayos, J. (2002).
Cloning of two new splice variants of Siglec-10 and mapping of the interaction
between Siglec-10 and SHP-1. Biochem. Biophys. Res. Commun. 296, 355-362.
117. Kiener, P.A., et al. (1997). Co-ligation of the antigen and Fc receptors gives rise
to the selective modulation of intracellular signaling in B cells. J. Biol. Chem.
272, 3838-3844.
118. Luque, L.E., Grape, K.P., and Junker, M. (2002). A highly conserved arginine is
critical for the functional folding of inhibitor of apoptosis (IAP) BIR domains.
Biochemistry 41, 13663-13671.
119. Reed, J.C., et al. (2003). Comparative analysis of apoptosis and inflammation
genes of mice and humans. Genome Res. 13, 1376-1388.
120. Liston, P., Young, S.S., Mackenzie, A.E., and Korneluk, R.G. (1997). Life and
death decisions: the role of the IAPs in modulating programmed cell death.
Apoptosis 2, 423-441.
121. Huang, Y., et al. (2001). Structural basis of caspase inhibition by XIAP:
Differential roles of the linker versus the BIR domain. Cell 104, 781-790.
122. Renatus, M., Stennicke, H.R., Scott, F.L., Liddington, R.C., and Salvesen, G.S.
(2001). Dimer formation drives the activation of the cell death protease caspase 9.
Proc. Natl. Acad. Sci. USA 98, 14250-14255.
123. Shiozaki, E.N., et al. (2003). Mechanism of XIAP-mediated inhibition of caspase-
9. Mol. Cell 11, 519-527.
124. Chai, J., et al. (2001). Structural basis of caspase-7 inhibition by XIAP. Cell 104,
769-780.
125. Riedl, S.J., et al. (2001). Structural basis for the inhibition of caspase-3 by XIAP.
Cell 104, 791-800.
126. Martins, L.M., et al. (2002). The serine protease Omi/HtrA2 regulates apoptosis
by binding XIAP through a Reaper-like motif. J. Biol. Chem. 277, 439-444.
127. LaCasse, E.C., Baird, S., Korneluk, R.G., and MacKenzie, A.E. (1998). The
inhibitors of apoptosis (IAPs) and their emerging role in cancer. Oncogene 17,
3247-3259.
128. Huang, Q., et al. (2000). Evolutionary conservation of apoptosis mechanisms:
lepidopteran and baculoviral inhibitor of apoptosis proteins are inhibitors of
mammalian caspase-9. Proc. Natl. Acad. Sci. USA 97, 1427-1432.
129. Hawkins, C.J., Ekert, P.G., Uren, A.G., Holmgren, S.P., and Vaux, D.L. (1998).
Anti-apoptotic potential of insect cellular and viral IAPs in mammalian cells. Cell
Death Differ. 5, 569-576.
130. Hay, B.A., Wassarman, D.A., and Rubin, G.M. (1995). Drosophila homologs of
baculovirus inhibitor of apoptosis proteins function to block cell death. Cell 83,
1253-1262.
131. Yang, Y., Fang, S., Jensen, J.P., Weissman, A.M., and Ashwell, J.D. (2000). Ubiquitin
protein ligase activity of IAPs and their degradation in proteasomes in response to
apoptotic stimuli. Science 288, 874-877.
132. Scott, F.L., et al. (2005). XIAP inhibits caspase-3 and -7 using two binding sites:
evolutionarily conserved mechanism of IAPs. EMBO J. 24, 645-655.
133. Srinivasula, S.M., et al. (2001). A conserved XIAP-interaction motif in caspase-9
and Smac/DIABLO regulates caspase activity and apoptosis. Nature 410, 112-
116.
134. Liu, Z., et al. (2000). Structural basis for binding of Smac/DIABLO to the XIAP
BIR3 domain. Nature 408, 1004-1008.
135. Verhagen, A.M., et al. (2000). Identification of DIABLO, a mammalian protein
that promotes apoptosis by binding to and antagonizing IAP proteins. Cell 102,
43-53.
136. ClustalW http://www.ebi.ac.uk/clustalw/.
137. Yang, D., Welm, A., and Bishop, J.M. (2004). Cell division and cell survival in
the absence of survivin. Proc. Natl. Acad. Sci. USA 101, 15100-15105.
138. Reed, J.C., and Bischoff, J.R. (2000). BIRinging chromosomes through cell
division—and survivin’ the experience. Cell 102, 545-548.
139. Derijard, B., et al. (1994). JNK1: a protein kinase stimulated by UV light and Ha-
Ras that binds and phosphorylates the c-Jun activation domain. Cell 76, 1025-
1037.
140. Takahashi, R., Deveraux, Q., Tamm, I., Welsh, K., Assa-Munt, N., Salvesen,
G.S., and Reed, J.C. (1998). A single BIR domain of XIAP sufficient for
inhibiting caspases. J. Biol. Chem. 273, 7787-7790.
141. Sun, C., Nettesheim, D., Liu, Z., and Olejniczak, E.T. (2005). Solution structure
of human Survivin and its binding interface with Smac/DIABLO. Biochemistry
44, 11-17.
142. Rodi, D.J., Soares, A.S., and Makowski, L. (2002). Quantitative assessment of
peptide sequence diversity in M13 combinatorial peptide phage display libraries.
J. Mol. Biol. 322, 1039-1052.
143. Tian, F., Tsao, M.-L., and Schultz, P.G. (2004). A phage display system with
unnatural amino acids. J. Am. Chem. Soc. 126, 15962-15963.
144. Deveraux, Q.L., Takahashi, R., Salvesen, G.S., and Reed, J.C. (1997). X-linked
IAP is a direct inhibitor of cell-death proteases. Nature 388, 300-304.
145. Hegde, R., et al. (2003). The polypeptide chain-releasing factor GSPT1/eRF3 is
proteolytically processed into an IAP-binding protein. J. Biol. Chem. 278, 38699-
38706.
146. Verhagen, A.M., et al. (2002). HtrA2 promotes cell death through its serine
protease activity and its ability to antagonize inhibitor of apoptosis proteins. J.
Biol. Chem. 277, 445-454.
147. Galvan, V., Kurakin, A.V., and Bredesen, D.E. (2004). Interaction of checkpoint
kinase 1 and the X-linked inhibitor of apoptosis during mitosis. FEBS Lett. 558,
57-62.
148. Li, Q., Liston, P., Schokman, N., Ho, J.M., and Moyer, R.W. (2005). Amsacta
moorei entomopoxvirus inhibitor of apoptosis suppresses cell death by binding
Grim and Hid. J. Virol. 79, 767-778.
149. Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular Cloning, 2nd
Edition (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press).
150. Laemmli, U. (1970). Cleavage of structural proteins during the assembly of the
head of bacteriophage T4. Nature 227, 680-685.
151. Holmes, D.S., and Quigley, M. (1981). A rapid boiling method for the preparation
of bacterial plasmids. Anal. Biochem. 114, 193-197.
152. Sanger, F., Nicklen, S., and Coulson, A. (1977). DNA sequencing with chain-
terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.