Building a Human Kinase Gene Repository: Bioinformatics, Molecular Cloning, and Functional Validation

Jaehong Park*, Yanhui Hu*, T. V. S. Murthy*, Fredrik Vannberg*, Binghua Shen*, Andreas Rolfs*, Jessica E. Hutti†, Lewis C. Cantley†, Joshua LaBaer*, Ed Harlow*, and Leonardo Brizuela*‡

*Harvard Institute of Proteomics, Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 320 Charles Street, Cambridge, MA 02141; and †Department of Systems Biology, Harvard Medical School, 330 Brookline Avenue, Boston, MA 02115

Contributed by Ed Harlow, April 21, 2005

Kinases catalyze the phosphorylation of proteins, lipids, sugars, nucleosides, and other important cellular metabolites and play key regulatory roles in all aspects of eukaryotic cell physiology. Here, we describe the mining of public databases to collect the sequence information of all identified human kinase genes and the cloning of the corresponding ORFs. We identified 663 genes, 511 encoding protein kinases and 152 encoding nonprotein kinases. We describe the successful cloning and sequence verification of 270 of these genes. Subcloning of this gene set in mammalian expression vectors and their use in high-throughput cell-based screens allowed the validation of the clones at the level of expression and the identification of previously uncharacterized modulators of the survivin promoter. Moreover, expression of the kinase genes in bacteria, followed by autophosphorylation assays, identified 21 protein kinases that showed autocatalytic activity. The work described here will facilitate the functional assaying of this important gene family in phenotypic screens and their use in biochemical and structural studies.

human kinome | autophosphorylation | cell-based screens | high-throughput cloning | survivin

Abbreviations: HT, high throughput; MGC, Mammalian Gene Collection; shRNA, short-hairpin RNA; TCF, T cell factor.

Data Deposition: The gene constructs reported in this paper have been deposited in the GenBank database (accession nos. AY335555–AY335786).

‡To whom correspondence should be addressed. E-mail: [email protected].

© 2005 by The National Academy of Sciences of the USA. PNAS, June 7, 2005, vol. 102, no. 23, pp. 8114–8119; www.pnas.org/cgi/doi/10.1073/pnas.0503141102

The term kinase refers to a large number of mechanistically, structurally, and evolutionarily distinct classes of enzymes. They catalyze the transfer of the γ-phosphate from nucleoside triphosphates to a large number of molecules, including proteins, sugars, nucleosides, and lipids, and affect the activity and fate of those molecules and the cell.

Phosphorylation is a common posttranslational modification of proteins and plays a key role in protein structure and function and in all aspects of cell physiology. Protein kinases contain well conserved motifs and constitute the largest family of proteins in the human genome (1–3). Mutations of protein kinases are involved in carcinogenesis and several other pathological conditions (4–6). Phosphorylation of other biomolecules also plays a critical role in the physiology and pathology of cells. Lipid kinases such as the phosphoinositide-3 kinase family members are key modulators of the cellular response to growth factors, hormones, and neurotransmitters and are involved in cancer (7). Nucleotide and nucleoside kinases regulate the intracellular levels of phosphate donors and nucleic acid precursors and are involved in the cellular response to damage and ischemia (8, 9). Sugar kinases regulate the rates of sugar metabolism, energy generation, and transcription activation and are involved in the process of cellular transformation and apoptosis (10–12).

The near completion of the Human Genome Project, the ongoing annotation projects, and the availability of sequence databases have allowed the genome-scale search and identification of members of different gene families by using sequence information as well as structural or functional annotations (2, 3, 13–15). However, a systematic cloning, sequence analysis, and functional validation effort for any of these gene sets has been challenging. Indeed, a major goal for experimental biology in this postgenomic era is the creation and use of state-of-the-art clone collections that exploit the newly obtained genome sequence and gene annotation. In the most useful collections, clones would represent fully sequence-verified ORFs, make use of recombination-based cloning techniques, and be arrayed in high-density formats where all positions are fully annotated (16, 17). All these properties will allow high-throughput (HT) subcloning of the genes in these collections, as well as facilitate experimentation (in any in vivo and in vitro system) and data collection/analysis (both positive and negative data).

In this study, we describe the construction and proof-of-principle use of such a collection for the human kinase genes. We describe the mining of public databases to identify all annotated human kinases (including protein and nonprotein kinases) and the generation of a sequence-verified clone collection for this gene set by using the CREATOR (BD Biosciences Clontech) cloning platform. We furthermore validated the expression of these clones, successfully screened their activity en masse in three independent cell-based assays, and confirmed enzymatic activity for some of those proteins in Escherichia coli. As we demonstrate here, this human kinase clone set will facilitate the study of this important gene class in both in vivo and in vitro settings.

Materials and Methods

Database Mining. To assemble the kinase list, LocusLink information was downloaded from the National Center for Biotechnology Information web site. Structured Query Language (SQL) queries were designed to retrieve genes with proper GO/EXTANNOT annotation, CDD annotation, or proper nomenclature. See Supporting Materials and Methods, which is published as supporting information on the PNAS web site.

Molecular Cloning. PCR amplification and cloning were carried out in a highly automated, laboratory information management system (LIMS)-supported pipeline by using CREATOR recombination-based cloning technologies (BD Biosciences Clontech) (see Supporting Materials and Methods).

Generating Expression-Ready Libraries. ORFs were subcloned from the pDNR-Dual master vector into mammalian or bacterial expression vectors. For mammalian expression, pLP-CMVneo, pLP-EGFP-C1, and pLPS-3′EGFP vectors (BD Biosciences Clontech) were chosen for the native, N-terminal, and C-terminal EGFP-tagged versions of each kinase, respectively. For bacterial expression, pGEX2tk (Amersham Pharmacia Biotech) was adapted for recombinational cloning (see Supporting Materials and Methods).

Mammalian Expression and Cell-Based Screens. Cotransfections of expression clones and reporter constructs were done by using FuGene6 (Roche Molecular Biochemicals) in a 96-well format. Reporter activity was measured by using luciferase reporter assay and Great EscAPe SEAP detection kits (BD Biosciences Clontech) (see Supporting Materials and Methods).

Bacterial Expression and Autophosphorylation. For more information on bacterial expression and autophosphorylation, see Supporting Materials and Methods.

Results and …

… efforts. Because the information at LocusLink regarding gene sequences and annotations changes rapidly, we subsequently repeated the query by using three LocusLink updates. Our most recent version of the human kinase gene set, based on the June 2004 analysis, consists of 663 genes. Of these, 511 genes (77%) encode protein kinases and 152 genes (23%) encode nonprotein kinases (Table 1). Genes encoding protein kinases were further classified in groups according to the extended classification of protein kinases (3). Nonprotein kinases comprise 23% of all annotated human kinases and are composed of heterogeneous groups of enzymes from the point of view of substrate specificity, gene sequence, and protein fold (ref. 21; Gene Ontology Consortium, www.godatabase.org). Data Set 1, which is published as supporting information on the PNAS web site, contains all relevant information for each of the 663 identified genes.

Contemporaneous to our initial analysis, Kostich et al. (2) and Manning et al. (3) described the identification of 510 and 518 human genes, respectively, encoding protein kinases by using sequence alignments and pair-wise comparisons. Comparison of the data from Manning et al. with our June 2004 search results for the protein kinase subset indicates that our current list is missing eight of the genes identified by Manning et al. Seven of these genes (SK573, SK581, SK592, SK650, SK681, SK707, and SK723) were associated with LocusLink records that had been retired in the June 2004 update. The eighth, SK200, did not have a corresponding full-length GenBank record at the time. The high degree of coverage obtained with these three studies suggests a general agreement on the composition of the human protein kinase gene family, at least based on available information.

Cloning into Recombinational Plasmid Vectors. We have developed …

Table 1. Classification of the human kinase genes identified and cloning success rate in this study

Type / Group          No. of genes   Successfully cloned genes   Success rate, %
Human kinome               663                 270                     40.7
Protein kinase             511                 186                     36.4
  AGC                       63                  24                     38.1
  Atypical                  40                  13                     32.5
  CAMK                      73                  30                     41.1
  CK1                       12                   7                     58.3
  CMGC                      62                  33                     53.2
  RGC                        5                   0                      0.0
  Other                     78                  24                     30.8
  STE                       45                  18                     40.0
  TK                        90                  22                     24.4
  TKL                       43                  15                     34.9
Nonprotein kinase          152                  84                     55.3
