United States Patent (19) 11 Patent Number: 6,057,101 Nandabalan Et Al
Total Page:16
File Type:pdf, Size:1020Kb
US006057101A United States Patent (19) 11 Patent Number: 6,057,101 Nandabalan et al. (45) Date of Patent: May 2, 2000 54) IDENTIFICATION AND COMPARISON OF 5,525,490 6/1996 Erickson et al. .......................... 435/29 PROTEIN-PROTEIN INTERACTIONS THAT 5,580,721 12/1996 Brent et al. ................................. 435/6 OCCUR IN POPULATIONS AND 5,580,736 12/1996 Brent et al. ................................. 435/6 IDENTIFICATION OF INHIBITORS OF OTHER PUBLICATIONS THESE INTERACTORS Mendelsohn et al., 1994, Curr. Opin. Biotech. 5:482-486. 75 Inventors: Krishnan Nandabalan, Guilford; Jonathan Marc Rothberg, Primary Examiner Eggerton A. Campbell Woodbridge; Meijia Yang, East Lyme; Attorney, Agent, or Firm-Pennie & Edmonds LLP James Robert Knight; Theodore 57 ABSTRACT Samuel Kalbfleisch, both of Branford, all of Conn. Methods are described for detecting protein-protein interactions, among two populations of proteins, each hav 73 Assignee: CuraGen Corporation, New Haven, ing a complexity of at least 1,000. For example, proteins are Conn. fused either to the DNA-binding domain of a transcriptional activator or to the activation domain of a transcriptional 21 Appl. No.: 08/874,825 activator. Two yeast Strains, of the opposite mating type and carrying one type each of the fusion proteins are mated 22 Filed: Jun. 13, 1997 together. Productive interactions between the two halves due to protein-protein interactions lead to the reconstitution of Related U.S. Application Data the transcriptional activator, which in turn leads to the activation of a reporter gene containing a binding site for the 63 Continuation-in-part of application No. 08/663,824, Jun. 14, DNA-binding domain. This analysis can be carried out for 1996. two or more populations of proteins. The differences in the 51 Int. Cl." ................................ C12O 1/68; C12O 1/70 genes encoding the proteins involved in the protein-protein 52 U.S. Cl. ...................................................... 435/6; 435/5 interactions are characterized, thus leading to the identifi 58 Field of Search .................................... 435/516, 91.2, cation of Specific protein-protein interactions, and the genes 435/69.1, 7.37; 536/23.1 encoding the interacting proteins, relevant to a particular tissue, Stage or disease. Furthermore, inhibitors that interfere 56) References Cited with these protein-protein interactions are identified by their ability to inactivate a reporter gene. The screening for Such U.S. PATENT DOCUMENTS inhibitors can be in a multiplexed format where a set of 4,593,002 6/1986 Dulbecco ............................. 435/172.3 inhibitors will be Screened against a library of interactors. 5,198.346 3/1993 Ladner et al. ... 435/69.1 Further, information-processing methods and Systems are 5,223,409 6/1993 Ladner et al. ... ... 435/69.7 described. These methods and systems provide for identifi 5,270,170 12/1993 Schatz et al. .......................... 435/7.37 cation of the genes coding for detected interacting proteins, 5,283,173 2/1994 Fields et al. ................................ 435/6 for assembling a unified database of protein-protein inter 5,322,801 6/1994 Kingston et al. 435/5.01 5,403,712 4/1995 Crabtree et al. .. ... 435/6 action data, and for processing this unified database to obtain 5,468,614 11/1995 Fields et al. ................................ 435/6 protein interaction domain and protein pathway information. 5,482,835 1/1996 King et al.. 5,512,473 4/1996 Brent et al. .......................... 435/240.2 190 Claims, 38 Drawing Sheets U.S. Patent May 2, 2000 Sheet 1 of 38 6,057,101 CONSTRUCT DNA-BINDING &t ACTIVATION DOMAIN FUSION LIBRARES FROM DIFFERENT POPULATIONS (TISSUE/DEVELOPMENT STAGE/DISEASE STAGE) ERFORM YEAST MATING SCREENS TO DENTIFY INTERACTING PARS OF PROTEINS N EACH POPULATION - ARRIVE AT"INTERACTIVE POPULATIONS" CHARACTERIZE INTERACTIVE POPULATIONS BY A COMBINATION OF POOLING STRATEGIES, QEA & SEQ-QEA ARRIVE AT PAIRS OF INTERACTIVE PROTEINS UNIQUE TO APOPULATION SCREEN FOR INHIBITORS OF THESE UNIQUE INTERACTORS USING A YEAST-BASED SELECTION OF INHIBITION OF INTERACTIONS ARRIVE AT LEAD ORUG COMPOUNDS THAT ARE SPECIFICALLY TARGETED AGAINST PAIRS OF INTERACTING PROTEINS UNIQUE TO A PARTICULAR POPULATION FIG.1 U.S. Patent May 2, 2000 Sheet 2 of 38 6,057,101 D-DNA-BINDING DOMAIN US HIS3/URA3AlagZ REPORTER GENE AACMATION DOMAIN X=DNA-BINDING DOMAINFUSION PROTEIN Y=ACTMATION DOMAINFUSION PROTEIN UAS HIS3/URA37 lac2. REPORTER GENE O/o (A) TRANSCRIPTION HIS3 REPORTER GENE GROWTH ON MEDIA THAT MONITORING (3-GALACTOSIDASE SELECTS FOR INTERACTING ACTMTY OF INTERACTING PROTEINS PROTEINS (e.g-HIS OR -URA MEDIUM) FIG.2 U.S. Patent May 2, 2000 Sheet 3 of 38 6,057,101 MXN SCREEN M&N ARE TWO POPULATIONS OF PROTEINS SPECIFIC TO A PARTICULAR STATE (e.g., CANCER) GROWTH ON -URA -HIS +3-AT MEDIUM (SELECTS FOR INTERACTORS) AND SCREEN FOR 6-GALACTOSIDASE ACTIVITY CREATION OF INTERACTIVE POPULATION TWO- OR THREE DIMENSIONAL POOLING STRATEGES TO FORM POOLS OF ALL INTERACTING PAIRS YEAST COLONY PCR FOR M & N INSERTS ON POOLS M-PCR N-PCR QEA & a CHARACTERIZE PARS OF c PROTEINS IN ONE DISEASE-STATE PERFORMANOTHER MXN ANALYSIS FOR A DIFFERENT STATE (e.g., NORMAL) COMPARISON OF THE TWO M x N ANALYSES EADS TO THE DENTIFICATION OF PARS OF INTERACTING PROTEINS SPECIFIC TO CERTAIN STATE (e.g., CANCER) FG3 U.S. Patent May 2, 2000 Sheet 4 of 38 6,057,101 Gel IMAGES 1A,B,C,D 1A2A3A4A PLATE POOLS p12 H p1/ROW POOLS A COLUMN POOLS FIG.4B U.S. Patent May 2, 2000 Sheet 5 of 38 6,057,101 2 3 4 5 6 7 8 9 10 11 12 ROW POOLS A-H 3 4 5 6 7 8 9 10 11 12 FG .4C U.S. Patent May 2, 2000 Sheet 6 of 38 6,057,101 My N SCREEN M & N ARE TWO POPULATIONS OF PROTEINS SPECIFIC TO A PARTICULAR STAGE (e.g., CANCER) GROWTH ON-URA MEDIUM (SELECT FOR INTERACTORS) YEAST COLONY PCR FOR M & N INSERTS M-PCR N-PCR STORE HALF OF THE PCR PRODUCTS COMBINE THE REMAINING HALVES IN AN ARRAYED FASHION TO CREATE PCR PAIRS POOL. ALL THE PCR PRODUCTS SPOT ONE-HALF OF EACH \| PCR PAIR ONTO NYLON MEMBRANES f & SEQ-QEA INTERACTIVE GRIDS IDENTIFY DIFFERENCE BANDS EACH SPOT CORRESPONDS TO GENES ENCODING A PAR OF INTERACTING PROTEINS USE TO PROBE INTERACTIVE GRIDS INTERACTIVE GRIDS IDENTIFY PARS OF INTERACTIVE PROTEINS THAT ARE STAGE-SPECIFIC FIG.5 U.S. Patent May 2, 2000 Sheet 7 of 38 6,057,101 N CHEMICALS X M INTERACTANTS N XM SCREENS NORMAL x NORMAL CANCER X CANCER NORMAL X CANCER POOL ALL POSITIVE INTERACTANTS (COLONIES) EACH O REPRESENTS THE POOL OF STAGE-SPECIFIC INTERACTANTS SCREEN FOR INHIBITION BY SMALL ORGANICS USING THE INHIBITION-DEPENDENT GROWTH ASSAY N O R M AL X NORMAL CANCER X CANCER NORMAL O O O O O O O O () O O O O O O O O O O O AT THIS STEP EACH O RECEIVES A DIFFERENT CHEMICAL O INHIBITION DEPENDENT GROWTH > ATUSED THIS CAN STEP BE THEENCODED INHIBITORS SO AS TO o NO GROWTH (NO INHIBITION) INCREASE CHEMICAL DIVERSITY (COMBINATORIAL APPROACHES CAN BE USED HERE) ARRIVE AT STAGE-SPECIFIC (NORMAL/DISEASE) INTERACTING PARS OF PROTEINS AND SMALL MOLECULAR INHIBITORS OF THESE INTERACTIONS FIG.6 U.S. Patent May 2, 2000 Sheet 8 of 38 6,057,101 ADC1-P PEPTIDE NLS UAGADC1-T ES32 Sfi I Asc I FIG.7 U.S. Patent May 2, 2000 Sheet 9 of 38 6,057,101 te areer s a. r rate O rise. O t U.S. Patent May 2, 2000 Sheet 10 Of 38 6,057,101 as e - r - - as - as . - - - - - - - - T - - - we - - - - - - - - - - - - - - - - arra-wis well rraraval a- - - we was sawa a m me m m. m s t e N cf. e t -- U.S. Patent May 2, 2000 Sheet 11 of 38 6,057,101 210 205 2O7 206 FG.10A 5'-AGC ACT CTC CAG CCT CTC ACC GAA-3' (SEQ ID NO: 1) 250 3'- AG TGG CTT CTAG-5 (SEQ ID NO:7) 5'-ACC GAC GTC GAC TAT CCA TGA AGC-3' (SEQ ID NO:42) 25, 3'-GT ACT TCG TCGA-5' (SEQ ID NO:44) FG, 10B 211 212 FIG.10C U.S. Patent May 2, 2000 Sheet 12 of 38 6,057,101 S U.S. Patent May 2, 2000 Sheet 13 Of 38 6,057,101 W||'0|| (ÇG'ONQIDES)01x00\,000\,0101000,001010700W-G 8||'0|| Z09)909) 0100010\–?º U.S. Patent May 2, 2000 Sheet 14 of 38 6,057,101 407. - - - - - 408 CB) 405 405 406 404 FG, 12A 421 409 422 U.S. Patent May 2, 2000 Sheet 15 of 38 6,057,101 1002 1003 SELECTED (HUMAN goNA) DB, 1004 1005 SELECTED DB. (PREDICTED COMPLETE EXPRESSED DATABASE SEQUENCES) 1007 (DB) 1006 SELECTED DB. (cDNACDS,EST) 1008 1009 SELECTED (CERTAIN DB. EXPRESSED SEQUENCES) F.G. 13A 1011 1012 1013 1014 HUMAN m TACT . U.S. Patent May 2, 2000 Sheet 16 of 38 6,057,101 1101 1102 RE 1103 REACTION 1 RE LABEL RE2 1104 REACTION 2 1105 1106 RE2 1107 RE3 1108 PROBE P ABEL4 1112 REACTION 15 1110 1111 1100 F.G. 14 U.S. Patent May 2, 2000 Sheet 17 Of 38 6,057,101 NOLIWTITWIS S00HIEW ZOZ! No.BJ|tg ||| U.S. Patent May 2, 2000 Sheet 18 of 38 6,057,101 N O O V CN ves O Cy s U.S. Patent May 2, 2000 Sheet 19 of 38 6,057,101 SAE) 1301 INPUT: A SEQ., TARGET SUB- 1302 SEQUENCES; PROBE FOREACH SUB- 1303 SEQ, MAKE VECTOR OF CUTS FOR ALL SUB SEQ, MERGE 1304 AND SORT VECTORS CREATE 1305 FRAGMENTS BETWEEN CUTS 1506 PROBE PRESENT YES NO FIG.16 FOR EACH FRAGMENT, FIND ITS SEQ.