Building a human kinase gene repository: Bioinformatics, molecular cloning, and functional validation

Jaehong Park*, Yanhui Hu*, T. V. S. Murthy*, Fredrik Vannberg*, Binghua Shen*, Andreas Rolfs*, Jessica E. Hutti†, Lewis C. Cantley†, Joshua LaBaer*, Ed Harlow*, and Leonardo Brizuela*‡

*Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141; and †Department of Systems Biology, Harvard Medical School, 330 Brookline Avenue, Boston, MA 02115

Contributed by Ed Harlow, April 21, 2005

Kinases catalyze the phosphorylation of proteins, lipids, sugars, nucleosides, and other important cellular metabolites and play key regulatory roles in all aspects of eukaryotic cell physiology. Here, we describe the mining of public databases to collect the sequence information of all identified human kinase genes and the cloning of the corresponding ORFs. We identified 663 genes, 511 encoding protein kinases and 152 encoding nonprotein kinases. We describe the successful cloning and sequence verification of 270 of these genes. Subcloning of this gene set in mammalian expression vectors and their use in high-throughput cell-based screens allowed the validation of the clones at the level of expression and the identification of previously uncharacterized modulators of the survivin promoter. Moreover, expression of the kinase genes in bacteria, followed by autophosphorylation assays, identified 21 protein kinases that showed autocatalytic activity. The work described here will facilitate the functional assaying of this important gene family in phenotypic screens and their use in biochemical and structural studies.

kinome | autophosphorylation | cell-based screens | high-throughput cloning | survivin

Abbreviations: HT, high throughput; MGC, Mammalian Gene Collection; shRNA, short-hairpin RNA; TCF, T cell factor.

Data Deposition: The gene constructs reported in this paper have been deposited in the GenBank database (accession nos. AY335555–AY335786).

‡To whom correspondence should be addressed. E-mail: [email protected].

© 2005 by The National Academy of Sciences of the USA

The term kinase refers to a large number of mechanistically, structurally, and evolutionarily distinct classes of enzymes. They catalyze the transfer of the γ-phosphate from nucleoside triphosphates to a large number of molecules, including proteins, sugars, nucleosides, and lipids, and affect the activity and fate of those molecules and the cell.

Phosphorylation is a common posttranslational modification of proteins and plays a key role in protein structure and function and in all aspects of cell physiology. Protein kinases contain well conserved motifs and constitute the largest family of proteins in the human genome (1–3). Mutations of protein kinases are involved in carcinogenesis and several other pathological conditions (4–6). Phosphorylations of other biomolecules also play a critical role in the physiology and pathology of cells. Lipid kinases such as the phosphoinositide-3 kinase family members are key modulators of the cellular response to growth factors, hormones, and neurotransmitters and are involved in cancer (7). Nucleotide and nucleoside kinases regulate the intracellular levels of phosphate donors and nucleic acid precursors and are involved in the cellular response to damage and ischemia (8, 9). Sugar kinases regulate the rates of sugar metabolism, energy generation, and transcription activation and are involved in the process of cellular transformation and apoptosis (10–12).

The near completion of the Human Genome Project, the ongoing annotation projects, and the availability of sequence databases have allowed the genome-scale search and identification of members of different gene families by using sequence information as well as structural or functional annotations (2, 3, 13–15). However, a systematic cloning, sequence analysis, and functional validation effort for any of these gene sets has been challenging. Indeed, a major goal for experimental biology in this postgenomic era is the creation and use of state-of-the-art clone collections that exploit the newly obtained genome sequence and gene annotation. In the most useful collections, clones would represent fully sequence-verified ORFs, make use of recombination-based cloning techniques, and be arrayed in high-density formats where all positions are fully annotated (16, 17). All these properties will allow high-throughput (HT) subcloning of the genes in these collections, as well as facilitate experimentation (in any in vivo and in vitro system) and data collection/analysis (both positive and negative data).

In this study, we describe the construction and proof-of-principle use of such a collection for the human kinase genes. We describe the mining of public databases to identify all annotated human kinases (including protein and nonprotein kinases) and the generation of a sequence-verified clone collection for this gene set by using the CREATOR (BD Biosciences Clontech) cloning platform. We furthermore validated the expression of these clones, successfully screened their activity en masse in three independent cell-based assays, and confirmed enzymatic activity for some of those proteins in Escherichia coli. As we demonstrate here, this human kinase clone set will facilitate the study of this important gene class in both in vivo and in vitro settings.

Materials and Methods

Database Mining. To assemble the kinase list, LocusLink information was downloaded from the National Center for Biotechnology Information web site. Structured Query Language (SQL) queries were designed to select genes with proper GO/EXTANNOT annotation, CDD annotation, or proper nomenclature. See Supporting Materials and Methods, which is published as supporting information on the PNAS web site.

Molecular Cloning. PCR amplification and cloning were carried out by using a highly automated, laboratory information management system (LIMS)-supported pipeline and CREATOR recombination-based cloning technologies (BD Biosciences Clontech) (see Supporting Materials and Methods).

Generating Expression-Ready Libraries. ORFs were subcloned from the pDNR-Dual master vector into mammalian or bacterial expression vectors. For mammalian expression, pLP-CMVneo, pLP-EGFP-C1, and pLPS-3′EGFP vectors (BD Biosciences Clontech) were chosen for the native, N-terminal, and C-terminal EGFP-tagged versions of each kinase, respectively. For bacterial expression, pGEX2tk (Amersham Pharmacia Biotech) was adapted for recombinational cloning (see Supporting Materials and Methods).
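As an illustration of the database-mining step described above, the following minimal Python sketch merges the hits of three independent queries and then removes hand-curated exclusions. It is not the SQL used in this study (see Supporting Materials and Methods); the function name, the gene identifiers, and the exclusion list are hypothetical placeholders chosen only to show the merge-then-curate logic.

```python
from typing import Iterable, List


def merge_query_hits(
    functional: Iterable[str],
    structural: Iterable[str],
    nomenclature: Iterable[str],
    curated_out: Iterable[str],
) -> List[str]:
    """Union the hits of three independent queries, then drop curated exclusions.

    A gene is kept if at least one query returned it and it was not
    removed during hand curation (e.g. an inhibitor or regulatory subunit).
    """
    candidates = set(functional) | set(structural) | set(nomenclature)
    return sorted(candidates - set(curated_out))


# Hypothetical identifiers; real input would be LocusLink records.
functional_hits = {"GENE_A", "GENE_B"}                     # functional annotation
structural_hits = {"GENE_B", "GENE_C"}                     # structural annotation
nomenclature_hits = {"GENE_C", "GENE_D", "GENE_E_INHIB"}   # gene nomenclature
removed_by_curation = {"GENE_E_INHIB"}                     # kinase-related, not a kinase

kinase_list = merge_query_hits(
    functional_hits, structural_hits, nomenclature_hits, removed_by_curation
)
print(kinase_list)
```

Taking the union rather than the intersection reflects the description that the three query results were merged, with false positives removed afterwards by hand curation.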

8114–8119 | PNAS | June 7, 2005 | vol. 102 | no. 23    www.pnas.org/cgi/doi/10.1073/pnas.0503141102

Mammalian Expression and Cell-Based Screens. Cotransfections of expression clones and reporter constructs were done by using FuGene6 (Roche Molecular Biochemicals) in a 96-well format. Reporter activity was measured by using luciferase reporter assay and Great EscAPe SEAP detection kits (BD Biosciences Clontech) (see Supporting Materials and Methods).

Bacterial Expression and Autophosphorylation. For more information on bacterial expression and autophosphorylation, see Supporting Materials and Methods.

Results and Discussion

Database Mining. Annotation and curation of the human genome, as well as its mining based on sequence and motif conservation, have been the subject of large and continuous efforts (2, 3, 18–20). To identify and collect sequence information of all of the identified and annotated human kinase genes, we performed term-based queries of public databases. We downloaded all of the LocusLink records available for human genes and used them to perform three independent SQL queries by using functional annotation, structural annotation, and gene nomenclature, respectively. The query results were then merged and hand-curated to eliminate kinase-related genes that our filters had not removed (including inhibitors, regulators, and subunits). Because most genes were represented by multiple GenBank accession numbers, representing alternative spliced forms of the genes, we selected the longest coding DNA sequences from the RefSeq and UniGene databases as the reference sequence for our cloning efforts. Because the information at LocusLink regarding gene sequences and annotations changes rapidly, we subsequently repeated the query by using three LocusLink updates.

Our most recent version of the human kinase gene set, based on the June 2004 analysis, consists of 663 genes: 511 of the genes (77%) encode protein kinases and 152 genes (23%) encode nonprotein kinases (Table 1). Genes encoding protein kinases were further classified in groups according to the extended classification of protein kinases (3). Nonprotein kinases comprise 23% of all annotated human kinases and are composed of heterogeneous groups of enzymes from the point of view of substrate specificity, gene sequence, and protein fold (ref. 21; Gene Ontology Consortium, www.godatabase.org). Data Set 1, which is published as supporting information on the PNAS web site, contains all relevant information for each of the 663 identified genes.

Table 1. Classification of the human kinase genes identified and cloning success rate in this study

Type                Group      No. of genes   Successfully cloned genes   Success rate, %
Human kinome                   663            270                         40.7
Protein kinases                511            186                         36.4
                    AGC        63             24                          38.1
                    Atypical   40             13                          32.5
                    CAMK       73             30                          41.1
                    CK1        12             7                           58.3
                    CMGC       62             33                          53.2
                    RGC        5              0                           0.0
                    Other      78             24                          30.8
                    STE        45             18                          40.0
                    TK         90             22                          24.4
                    TKL        43             15                          34.9
Nonprotein kinase              152            84                          55.3

Contemporaneously with our initial analysis, Kostich et al. (2) and Manning et al. (3) described the identification of 510 and 518 human genes encoding protein kinases, respectively, by using sequence alignments and pair-wise comparisons. Comparison of the data from Manning et al. with our June 2004 search results for the protein kinase subset indicates that our current list is missing eight of the genes identified by Manning et al. Seven of these genes (SK573, SK581, SK592, SK650, SK681, SK707, and SK723) were associated with LocusLink records that had been retired in the June 2004 update. The eighth, SK200, did not have a corresponding full-length GenBank record at the time. The high degree of coverage obtained with these three studies suggests a general agreement on the composition of the human protein kinase gene family, at least based on available information.

Cloning into Recombinational Plasmid Vectors. We have developed a laboratory information management system-supported, highly automated cloning and validation pipeline for genome-scale cloning by using recombination-based cloning technologies (16, 17). Once we had accumulated the sequence and annotation of the human kinase genes in our relational databases, we proceeded to the HT PCR amplification and cloning of these genes. For the work described here, we initially targeted the human kinase genes whose ORF size was <4 kb (594 genes). Two strategies were used to clone the targeted genes. For kinase genes present in the Mammalian Gene Collection (MGC, March 2003 release), the MGC clones were used as templates for amplification (152 genes). The rest of the genes (442) were amplified by using a first-strand cDNA pool produced in the laboratory from normal human placenta and brain tissues. PCR products of the expected sizes were generated for 99% of the genes amplified from MGC templates and for 46% of those amplified from the first-strand cDNA pool. Amplification of the additional genes is taking advantage of information on mRNA abundance on the different

Fig. 1. Expression of C-terminal EGFP-tagged kinases in HEK293T cells. Normalized fluorescence of the C-terminal EGFP-tagged kinases in HEK293T cells, determined by plate reader, and fluorescence microphotographs of representative wells are shown.

Fig. 2. Screen for kinases that regulate the survivin promoter. (A) Activity of C-terminal EGFP-tagged kinases on the survivin promoter in HEK 293T cells. (B) Scatter plot of z values obtained from interday experiments in A. (C) Characterization of survivin promoter induction by using top activating and inhibiting kinases identified from the screen.

tissues, on the use of alternative cDNA libraries, recent additions to the MGC repository, and on new amplification protocols.

One of the major quality-control points in our clone production pipeline is the full sequence analysis of the amplified ORFs. This analysis allows the detection and elimination of clones with high-confidence sequence discrepancies with respect to the reference sequence due to the introduction of mutations during the PCR amplification. We eliminated any clone with nonsense or frame-shift mutations or with more than one missense mutation (after disregarding reported nucleotide polymorphisms). We successfully cloned and accepted 73% and 55% of the PCR products obtained with the MGC template and the first-strand cDNA library, respectively. Most of the rejected clones in the MGC group were significantly shorter than the targeted reference sequence, although the clone sequence matched the MGC template. The main reasons for failures in the first-strand cDNA group were the presence of deleterious mutations in the clones, such as nonsense mutations, frame-shift mutations, multiple missense mutations, and mutations in the linker region introduced by the PCR primers. The findings highlight the need and importance of full-length sequence validation of the clones produced in these types of operations.

The number of accepted genes from this first pass of cloning and sequence analysis was 270, corresponding to 186 protein kinases and 84 nonprotein kinases. The coverage of protein kinase groups with at least one accepted clone ranged from 24% (TK) to 58% (CK1); none of the five RGC genes was cloned (Table 1). For the nonprotein kinases, we successfully cloned 55%

Fig. 3. Screen and characterization of differentially tagged kinases on the TCF/lymphoid enhancing factor responsive element. (A) Effect of the differentially tagged kinases on the expression of the SEAP reporter protein in HEK 293T cells. Characterized kinases are marked by arrowheads. (B) Specificity of the TCF modulation by comparing the activity of the kinases on wild-type c-myc promoter (TOP) (Upper) and scrambled TOP (FOP) (Lower).

of the targeted genes. GenBank accession numbers AY335555–AY335786 have been obtained for the new clones, and the clone collection is now available from multiple distributors including RZPD, MRC, and the Dana–Farber/Harvard Cancer Center "DNA Resource Core."

There are several qualities of this kinase gene collection that make it relevant. The clones in the collection are represented by fully sequence-verified human ORFs cloned in a recombination-based cloning system, arrayed in multiwell plates, fully indexed, and annotated. All these properties facilitate the parallel creation of fully representative expression libraries, the assaying of the clone collection, and analysis of the resulting data in a comprehensive and HT fashion, as shown below.

Expression of Kinases in Mammalian Cells. To validate the expression and activity of the kinase clones in cells, we transferred the kinase genes into three mammalian expression vectors (see Materials and Methods). We then made use of the C-terminal EGFP-tagged kinase library to analyze the level of expression of 223 clones, based on the fluorescence level of the tagged kinases. HEK 293T cells were transfected in triplicate in 96-well format with the kinase constructs, and 94% of the clones were found to be positive for GFP fluorescence when compared, by plate reader, with cells transfected with the empty C-terminal EGFP vector (Fig. 1). The same positive clones identified by the plate reader also scored positive upon microscopic analysis (see Fig. 1 for representative samples). There were reproducible differences in GFP fluorescence signal (up to 86-fold) among different clones, with no significant correlation between the size of the complete DNA sequence and GFP fluorescence. Furthermore, microscopic analysis of the transfected cells allowed us, in most cases, to identify the cellular distribution of the recombinant kinases (data not shown). For example, the recombinant forms of Src family members (CSK, LYN, and YES1) and Tec family members (ITK, BTK, and BMX) showed cytoplasmic or plasma membrane localization, as expected (22, 23). Also, recombinant PIP5K1A and B, ACVR1, and GPRK5 were mainly localized in the cell membrane (24, 25) (Fig. 5, which is published as supporting information on the PNAS web site). Recombinant PLK, CHEK1, and CHEK2 were localized in the nucleus, consistent with the temporal association of these proteins with this cellular compartment (26, 27). GFP fluorescence is only an indirect measurement of the expression of the recombinant kinases. However, analysis of the relative levels of fluorescence and immunoblot signal (by using anti-GFP antibodies), for a subset of genes identified in the screen described below, indicated a good correlation between fluorescence signal and protein levels (Fig. 6, which is published as supporting information on the PNAS web site).
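The clone acceptance rule stated earlier under Cloning into Recombinational Plasmid Vectors (reject any clone with a nonsense or frame-shift mutation, or with more than one missense mutation after disregarding reported polymorphisms) can be sketched as follows. This is an illustrative sketch rather than the pipeline's actual code: the `Discrepancy` record, its field names, and the choice to disregard reported polymorphisms before every check (not only the missense count) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Discrepancy:
    """One difference between a clone sequence and its reference (hypothetical record)."""
    kind: str                          # "nonsense", "frameshift", "missense", or "silent"
    reported_polymorphism: bool = False


def accept_clone(discrepancies: Iterable[Discrepancy]) -> bool:
    """Return True if the clone passes the acceptance rule described in the text."""
    # Assumption: reported polymorphisms are ignored before any check is applied.
    relevant = [d for d in discrepancies if not d.reported_polymorphism]

    # Any nonsense or frame-shift mutation rejects the clone outright.
    if any(d.kind in ("nonsense", "frameshift") for d in relevant):
        return False

    # At most one missense mutation is tolerated.
    missense_count = sum(1 for d in relevant if d.kind == "missense")
    return missense_count <= 1


# Example: two missense changes, one of which is a reported polymorphism.
example = [Discrepancy("missense"), Discrepancy("missense", reported_polymorphism=True)]
print(accept_clone(example))
```

Silent changes never count against a clone in this sketch, because the rule in the text names only nonsense, frame-shift, and missense mutations.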

Screens of Clones in Cell-Based Assays. After having validated the expression of the library in mammalian cells, the human kinase expression clone sets were screened in two independent HT cell-based reporter gene assays. In the first screen, we looked for kinase genes with capacity to modulate the survivin promoter element. Survivin (BIRC5) is involved in the inhibition of apoptosis, and its expression is up-regulated in most cancer cells (28). Although survivin levels are controlled at the transcriptional level in a cell cycle-restricted manner (29), mounting evidence indicates that several oncogenic pathways might also regulate its transcription (30). Each of the 223 C-terminal EGFP-tagged kinase clones analyzed before was then cotransfected with a luciferase reporter construct containing the survivin promoter region (−1430 to −1), pLuc-1430c (29), in HEK 293T cells in triplicate. Normalized luciferase activity data from two independent experiments are shown in Fig. 2A. A scatter plot of the z values from the two independent assays indicates a good level of correlation and a high degree of reproducibility of the kinase-induced effect (r = 0.92, Fig. 2B).

We then selected the eight genes that showed the highest positive modulatory activity (ADK, ATR, MAPK1, MAP2K5, PFKM, PRKR, STK10, and STK22C), as well as four genes showing inhibitory activity (BLK, HRI, MAP3K7, and PIM1), for further analysis. All eight activating kinases significantly up-regulated the survivin promoter activity in the confirmatory experiments, showing 3- to 32-fold induction (Fig. 2C). Likewise, the four inhibitory kinases reproducibly down-regulated the basal survivin promoter activity. The magnitude of inhibition induced by these kinases was uniform and limited to 0.6-fold inhibition (Fig. 2C). Importantly, consistent with the antiapoptotic role of survivin, expression of all of the activator kinases, except PFKM, was found to protect cells from TNF-related apoptosis-inducing ligand (TRAIL)-induced apoptosis (data not shown). Consistent with our results, both MAP2K5 and MAPK1 have been found to inhibit TRAIL-induced apoptosis (31, 32), and MAP3K7 (TAK1) has been reported to induce apoptosis through JNK (33) and p38 activation (34). Our initial screen and follow-up experiments have also allowed the identification of genes, including the unexpected result with the nucleoside kinase ADK, that clearly regulate the survivin promoter and protect cells from TRAIL-induced apoptosis and that have not been reported before.

In a second independent cell-based screen, we made use of three different forms of the kinases (native, N-terminal, and C-terminal EGFP-tagged) and tested them for their ability to up-regulate the T cell factor (TCF)/lymphoid enhancing factor-binding region of the c-myc promoter (see Materials and Methods), a well characterized responsive element involved in the WNT signaling pathway (35). As shown in Fig. 3A, significant up-regulation of the normalized c-myc promoter–secreted alkaline phosphatase reporter activity was achieved by 5 of the 699 constructs tested (233 kinases in three different expression vectors). Only one of the genes identified by the five strongest hits was represented by more than one construct (CK1-δ); the other two genes uncovered in the screen (CK1-ε and PRKR) were identified by only one of the three alleles generated for each one of those genes. These results clearly highlight the importance of generating and testing genes with alternative tags, thus generating different alleles for each gene, to enhance the chances of identifying the involvement of a gene in a given pathway. To further validate whether the activity of these constructs was specific to the TCF/lymphoid enhancing factor responsive element, we compared their activity and that of some negative controls against the wild type and a point-mutated scrambled TCF responsive element, FOP (35). All but the C-terminal EGFP-tagged PRKR showed specific activation of the TCF responsive element (Fig. 3B). Our findings are consistent with reports that the overexpression of CK1-δ or -ε is sufficient to activate WNT signaling, but CK1-α, CK1-γ, and CK2-α could not induce any significant activation of the β-catenin reporter (36–38).

Bacterial Expression and Purification for Autophosphorylation Assay. Bacterial expression represents a simple and cost-effective system for expression and purification of proteins. However, this system is often not suitable for expression of properly folded, posttranslationally modified, and active mammalian enzymes. Nevertheless, even in the case of protein kinases, it is possible to produce properly folded as well as active enzymes in bacteria (39–41). Cumulative data from testing individual mammalian kinases in our laboratory indicate that 5–10% of those genes expressed active enzymes in bacteria. We thus took advantage of the large number of kinases present in our set, and our ability to process them in parallel, to identify previously uncharacterized kinase genes that produce active enzymes in E. coli in a quick and efficient manner. Autophosphorylation of the recombinant protein in the total bacterial extract was taken as a measure of kinase activity for any given protein. Twenty-one of 233 recombinant kinases tested (9%) showed strong kinase activity in both the original screen (data not shown) and in the retest. These corresponded to BCKDK, BMX, CKLiK, CLK4, DAPK3, FRK, HCK, HRI, LYN, OSR1, PDK3, PHKG2, PKMYT1, PLK, SNK, STK3, STK16, STK17A, STK38L, VRK1, and VRK2. Fig. 4 shows the autophosphorylation and anti-GST immunoblot results for 11 of the identified kinases. Among the 21 enzymes identified in this analysis were representatives of all of the protein kinase groups (3), indicating that, in principle, it is possible to obtain active kinases in bacteria for proteins in any of 10 protein kinase groups. Some of the enzymes identified here have also been described in ref. 41. Our results will aid in the establishment of enzymatic assays for determination of the substrate specificities for the newly identified active kinases (42) as well as helping in the definition of the specificity of kinase inhibitors. Finally, we have also identified active enzymes (such as PKMYT1) that show sufficient expression levels for structural studies.

Fig. 4. Autoradiograph and Western blot of selected active kinases produced in E. coli. (A) Autoradiograph of bacterial lysates separated by SDS/PAGE after kinase assay. The active bands corresponding to the expected protein size are marked by arrowheads. (B) Western blot analysis of the bacterial lysates corresponding to those used in A by using anti-GST antibodies. Expected size bands are marked by arrowheads. Molecular size markers (Mr) are indicated on the left.

Summary. Complete exploitation of the human genome sequence requires the building of state-of-the-art and comprehensive ORF repositories at both the gene-family and genome scales. These types of repositories will serve not only as validation tools but also as unique HT discovery tools in the form of clones (for

cell-based screens) and proteins (for biochemical and immunological assays). In this study, we have concentrated on the definition of the human kinase gene set and on the initial cloning and characterization of this important clone collection. Expression of the obtained kinase set in both in vivo and in vitro assays, in an HT manner, has allowed the validation of the kinase constructs and has demonstrated the value of this gene set for cell-based and biochemical screens. Some of the hits identified in the two cell-based validation screens correlate with the published literature. Furthermore, some of the hits in the screen for regulators of the survivin promoter suggested new regulatory elements that also affect the apoptotic response of cells. Some of these genes encode sugar and nucleoside kinases and require further analysis.

Use of the clones in biochemical assays also allowed the identification of 21 kinase genes that possess autocatalytic activity when expressed in bacteria. These kinases could prove important for use in in vitro studies to define substrate specificities (by using peptide libraries) and to profile kinase inhibitors.

In addition to the wild-type kinase collection described here, we recognized the value of generating both a short-hairpin RNA (shRNA) library and a "dominant negative" library for the human kinase gene set. Accordingly, the information generated here contributed to the construction of the shRNA library described by Paddison et al. (43) such that 80% of the human kinase genes identified here have been covered by multiple shRNA constructs in that collection. Another positive step could be to generate dominant negative (kinase-dead) alleles for those protein kinases whose wild-type alleles have been captured so far. Coordinated use of the wild-type kinase, the dominant negative, and the shRNA collections will facilitate the identification and understanding of the role of the human kinase genes in any phenotypic assay. Furthermore, construction of wild-type and alternative tagged forms of the genes results in the creation of alleles with differential activities, further enhancing the chances of identifying hits in a given cell-based assay by using gene overexpression screens.

The kinase clones described here add to the growing number of recently generated human ORF and arrayed cDNA clone collections, which allow indexed experimentation at the gene family or subgenome scales (38, 44, 45). Future challenges in this area are the completion of relevant gene-family ORF collections (13–15) (to include all genes in a given gene family and all alternative spliced forms of every gene) and the production of clone isolates with full-length sequence verification (such as the clones in this study) for all predicted human ORFs.

We thank Drs. Bert Vogelstein for pTOP-luc and pFOP-luc (The Johns Hopkins University, Baltimore), Dario Altieri for pLuc1430c (University of Massachusetts Medical School, Worcester, MA), Xi He for helpful comments, and Greg Hannon for generation of shRNA. This work was partially funded by the National Cancer Institute–Frederick Research Development Center Subcontract 22XS136A.

1. Hanks, S. K., Quinn, A. M. & Hunter, T. (2002) Science 241, 42–52.
2. Kostich, M., English, J., Madison, V., Gheyas, F., Wang, L., Qiu, P., Greene, J. & Laz, T. M. (2002) Genome Biol. Res. 3, 0043.1–0043.12.
3. Manning, G., Whyte, D. B., Martinez, R., Hunter, T. & Sudarsanam, S. (2002) Science 298, 1912–1934.
4. Cohen, P. (2001) Eur. J. Biochem. 268, 5001–5010.
5. Bishop, J. M. (1991) Cell 64, 235–248.
6. Blume-Jensen, P. & Hunter, T. (2001) Nature 411, 355–365.
7. Cantley, L. C. (2002) Science 296, 1655–1657.
8. Kowaluk, E. A., Bhagwat, S. S. & Jarvis, M. F. (1998) Curr. Pharm. Des. 4, 403–416.
9. Rasouli-Nia, A., Karimi-Busheri, F. & Weinfeld, M. (2004) Proc. Natl. Acad. Sci. USA 101, 6905–6910.
10. Bustamante, E., Morris, H. P. & Pedersen, P. L. (1981) J. Biol. Chem. 256, 8699–8704.
11. Smith, T. A. (2000) Br. J. Biomed. Sci. 57, 170–178.
12. Majewski, N., Nogueira, V., Bhaskar, P., Coy, P. E., Skeen, J. E., Gottlob, K., Chandel, N. S., Thompson, C. B., Robey, R. B. & Hay, N. (2004) Mol. Cell 16, 819–830.
13. Gebauer, M., von Melchner, H. & Beckers, T. (2001) Genome Res. 11, 1871–1877.
14. Kawasawa, Y., McKenzie, L. M., Hill, D. P., Bono, H., RIKEN GER Group, GSL Members & Yanagisawa, M. (2003) Genome Res. 13, 1466–1477.
15. Alonso, A., Sasin, J., Bottini, N., Friedberg, I., Friedberg, I., Osterman, A., Godzik, A., Hunter, T., Dixon, J. & Mustelin, T. (2004) Cell 117, 699–711.
16. Brizuela, L., Richardson, A., Marsischky, G. & LaBaer, J. (2002) Arch. Med. Res. 33, 318–324.
17. LaBaer, J., Qiu, Q., Anumanthan, A., Mar, W., Zuo, D., Murthy, T. V., Taycher, H., Halleck, A., Hainsworth, E., Lory, S., et al. (2004) Genome Res. 14, 2190–2200.
18. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., et al. (2000) Nat. Genet. 25, 25–29.
19. Camon, E., Magrane, M., Barrell, D., Binns, D., Fleischmann, W., Kersey, P., Mulder, N., Oinn, T., Maslen, J., Cox, A., et al. (2003) Genome Res. 13, 662–672.
20. Imanishi, T., Itoh, T., Suzuki, Y., O’Donovan, C., Fukuchi, S., Koyanagi, K. O., Barrero, R. A., Tamura, T., Yamaguchi-Kabata, Y., Tanino, M., et al. (2004) PLoS Biol. 2, 856–875.
21. Bork, P., Sander, C. & Valencia, A. (1993) Protein Sci. 2, 31–40.
22. Varmus, H., Hirai, H., Morgan, D., Kaplan, J. & Bishop, J. M. (1989) Proc. Int. Symp. Princess Takamatsu Cancer Res. Fund 20, 63–70.
23. Kawakami, Y., Yao, L., Han, W. & Kawakami, T. (1996) Immunol. Lett. 54, 113–117.
24. Chou, M. M., Hou, W., Johnson, J., Graham, L. K., Lee, M. H., Chen, C. S., Newton, A. C., Schaffhausen, B. S. & Toker, A. (1998) Curr. Biol. 8, 1069–1077.
25. Penela, P., Ribas, C. & Mayor, F., Jr. (2003) Cell Signalling 15, 973–981.
26. Oe, T., Nakajo, N., Katsuragi, Y., Okazaki, K. & Sagata, N. (2001) Dev. Biol. 229, 250–261.
27. Tsvetkov, L., Xu, X., Li, J. & Stern, D. F. (2003) J. Biol. Chem. 278, 8468–8475.
28. Altieri, D. C. (2003) Nat. Rev. Cancer 3, 46–54.
29. Li, F. & Altieri, D. C. (1999) Biochem. J. 344, 305–311.
30. Zhang, T., Otevrel, T., Gao, Z., Gao, Z., Ehrlich, S. M., Fields, J. Z. & Boman, B. M. (2001) Cancer Res. 61, 8664–8667.
31. Aza-Blanc, P., Cooper, C. L., Wagner, K., Batalov, S., Deveraux, Q. L. & Cooke, M. P. (2003) Mol. Cell 12, 627–637.
32. Tran, S. E., Holmstrom, T. H., Ahonen, M., Kahari, V. M. & Eriksson, J. E. (2001) J. Biol. Chem. 276, 16484–16490.
33. Yang, X., Kovalenko, D., Nadeau, R. J., Harkins, L. K., Mitchell, J., Zubanova, O., Chen, P. Y. & Friesel, R. (2004) J. Biol. Chem. 279, 38099–38102.
34. Edlund, S., Bu, S., Schuster, N., Aspenstrom, P., Heuchel, R., Heldin, N. E., ten Dijke, P., Heldin, C. H. & Landstrom, M. (2003) Mol. Biol. Cell 14, 529–544.
35. He, T. C., Sparks, A. B., Rago, C., Hermeking, H., Zawel, L., da Costa, L. T., Morin, P. J., Vogelstein, B. & Kinzler, K. W. (1998) Science 281, 1509–1512.
36. Cadigan, K. M. & Nusse, R. (1997) Genes Dev. 11, 3286–3305.
37. Giles, R. H., van Es, J. H. & Clevers, H. (2003) Biochim. Biophys. Acta 1653, 1–24.
38. Liu, J., Bang, A. G., Kintner, C., Orth, A. P., Chanda, S. K., Ding, S. & Schultz, P. G. (2005) Proc. Natl. Acad. Sci. USA 102, 1927–1932.
39. Chambers, S. P., Austen, D. A., Fulghum, J. R. & Kim, W. M. (2004) Protein Expr. Purif. 36, 40–47.
40. Stout, T. J., Foster, P. G. & Matthews, D. J. (2004) Curr. Pharm. Des. 10, 1069–1082.
41. Zhan, K., Vattem, K. M., Bauer, B. N., Dever, T. E., Chen, J. J. & Wek, R. C. (2002) Mol. Cell Biol. 22, 7134–7146.
42. Hutti, J. E., Jarrell, E. T., Chang, J. D., Abbott, D. W., Storz, P., Toker, A., Cantley, L. C. & Turk, B. E. (2004) Nat. Methods 1, 27–29.
43. Paddison, P. J., Silva, J. M., Conklin, D. S., Schlabach, M., Li, M., Aruleba, S., Balija, V., O’Shaughnessy, A., Gnoj, L., Scobie, K., et al. (2004) Nature 428, 427–431.
44. Michiels, F., van Es, H., van Rompaey, L., Merchiers, P., Francken, B., Pittois, K., van der Schueren, J., Brys, R., Vandersmissen, J., Beirinckx, F., et al. (2002) Nat. Biotechnol. 20, 1154–1157.
45. Rual, J. F., Hirozane-Kishikawa, T., Hao, T., Bertin, N., Li, S., Dricot, A., Li, N., Rosenberg, J., Lamesch, P., Vidalain, P. O., et al. (2004) Genome Res. 14, 2128–2135.
