Genes Genet. Syst. (2004) 79, p. 177–182

Cloning, characterization and expression of CDK5RAP1_v3 and CDK5RAP1_v4, two novel splice variants of human CDK5RAP1

Xianqiong Zou1,2, Chaoneng Ji1, Feng Jin1, Jianping Liu1, Maoqing Wu1, Huarui Zheng1, Yu Wang1, Xin Li1, Jian Xu1, Shaohua Gu1, Yi Xie1,*, Yumin Mao1,**
1State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai 200433, P. R. China
2School of Resource Processing and Bio-engineering, Central South University, Changsha 410083, P. R. China

(Received 17 December 2003, accepted 28 April 2004)

Two novel splice variants of CDK5RAP1, named CDK5RAP1_v3 and CDK5RAP1_v4, were isolated through the large-scale sequencing analysis of a human fetal brain cDNA library. The CDK5RAP1_v3 and CDK5RAP1_v4 cDNAs are 1923 bp and 1792 bp in length, respectively. Sequence analysis revealed that CDK5RAP1_v4 lacked one exon that was present in CDK5RAP1_v3, with the result that these cDNAs encoded different putative proteins. The deduced proteins were 574 amino acids (designated as CDK5RAP1_v3) and 426 amino acids (CDK5RAP1_v4) in length, and shared the 420 N-terminal amino acids. RT-PCR analysis showed that human CDK5RAP1_v3 was widely expressed in human tissues. The expression level of CDK5RAP1_v3 was relatively high in placenta and lung, whereas low levels of expression were detected in heart, brain, liver, skeletal muscle, pancreas, spleen, thymus, small intestine and peripheral blood leukocytes. In contrast, human CDK5RAP1_v4 was mainly expressed in brain, placenta and testis.
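As a rough illustration of how the loss of a single exon can yield a shorter protein that shares its N-terminus with the longer isoform, the sketch below translates a toy transcript with and without one exon. The exon sequences are invented for illustration and are not CDK5RAP1 sequences. The assumption that the skipped exon's length is not a multiple of three, so that the downstream reading frame shifts, is ours and is not stated in the text.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate from the first base until the first stop codon (or the end)."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading residues that two proteins have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical toy exons (NOT real CDK5RAP1 sequence).
upstream_exon = "ATGGCTGCT"           # 9 nt, starts with ATG
skipped_exon = "GGTCA"                # 5 nt: not a multiple of 3, so skipping shifts the frame
downstream_exon = "CCTTGAAATAGTTTAA"  # read in a different frame in each isoform

long_cdna = upstream_exon + skipped_exon + downstream_exon
short_cdna = upstream_exon + downstream_exon

long_protein = translate(long_cdna)
short_protein = translate(short_cdna)

print(long_protein, len(long_protein))
print(short_protein, len(short_protein))
print("shared N-terminal residues:", shared_prefix_length(long_protein, short_protein))
```

In this toy case the two products share their first residues and then diverge, with the exon-skipped form terminating early. This is qualitatively the pattern reported for the 574 and 426 amino acid proteins, which share their 420 N-terminal residues.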

Key words: , Elp3, human CDK5RAP1, RT-PCR, TRAM domain

Edited by Tatsuro Ikeuchi
* Corresponding author. E-mail: [email protected]
** Corresponding author. E-mail: [email protected]

INTRODUCTION

Neuronal Cdc2-like kinase (Nclk) is involved in the regulation of neuronal differentiation and neuro-cytoskeleton dynamics. The active kinase consists of a catalytic subunit, Cdk5, and a 25-kDa activator (p25nck5a) derived from a 35-kDa neuron-specific protein (p35nck5a) (Ching et al., 2000). Previous experiments suggested that Cdk5 exists in three distinct states: a free monomer, a form associated with p25nck5a and a form complexed with p35nck5a (Lee et al., 1996). CDK5RAP1, CDK5 regulatory subunit associated protein 1 (Aliases: C20orf34, C42, CGI-05, HSPC167, Loc 51654), is one of the p35nck5a-associated proteins. It was expressed in Escherichia coli strain BL21 and was shown to undergo high affinity association with both the free and the Cdk5 complex forms of p35nck5a (Ching et al., 2000).

Two novel splice variants of CDK5RAP1, which we named CDK5RAP1_v3 and CDK5RAP1_v4, were isolated through the large-scale sequencing analysis (for detecting new full-length human ) of a human fetal brain cDNA library. We termed the original two splice variants CDK5RAP1_v1 (NM_016408) and CDK5RAP1_v2 (NM_016082). Here we report cloning and characterization of CDK5RAP1_v3 and CDK5RAP1_v4, together with data on the CDK5RAP1 structure and mRNA tissue distribution.

MATERIAL AND METHODS

cDNA library construction and DNA sequencing

A cDNA library was constructed in a modified pBluescript II SK (+) vector with human fetal brain mRNA purchased from Clontech. A 0.5-kb DNA fragment containing SfiA (5’-GGCCATTATGGCC-3’) and SfiB (5’-GGCCGCCTCGGCC-3’) recognition sites was introduced into the EcoRI and NotI sites of pBluescript II SK (+) (Stratagene); the modified vector was then digested with SfiI and the large fragment was excised and purified for library construction. After SfiA digestion and cDNA size fractionation, cDNAs longer than 500 bp were ligated into the SfiA and SfiB sites of the modified vector and the resultant vector was transformed into Escherichia coli strain DH5α. A 96-well R.E.A.L plasmid kit (Qiagen) was used to synthesize the double-stranded plasmid. Sequencing reactions were performed using BigDye Terminator Cycle Sequencing Kit and BigDye Primer Cycle Sequencing Kit (Perkin-Elmer). The complete sequence was determined and confirmed by primer walking strategy. Subsequent editing and assembly of all the sequences from one clone was performed using Acembly (Sanger’s Center). In total, approximately 20000 human cDNAs were sequenced during 1999–2001.

Bioinformatics analysis

DNA and protein sequence comparisons were carried out using BLAST2.0 at NCBI (http://www.ncbi.nlm.nih.gov/blast). Motif analysis was done with RPS-BLAST searching; PROFILESCAN (http://www.isrec.isb-sib.ch/software/PFSCAN_form.html) and SMART searching (http://smart.embl-heidelberg.de/) were also used. In addition, the analysis software packages GCG (Wisconsin Package, Version 10.0), Gene Runner (http://www.generunner.com/) and GeneDoc (http://www.psc.edu/biomed/genedoc/) were used.

RT-PCR

Two human Multiple Tissue cDNA (MTC) panels (CLONTECH) were used as PCR templates according to the manufacturer’s protocol. The sense primer sequence for CDK5RAP1_v3 and CDK5RAP1_v4 was 5’-GCAGCTGATTCATGAGAGAGATAAC-3’, and the antisense primer for CDK5RAP1_v3 and CDK5RAP1_v4 was 5’-ACCAACACTCATTTCACAGTGCAAG-3’. G3PDH-specific primers were 5’-TGAAGGTCGGAGTCAACGGATTTGGT-3’ (G3PDHF) and 5’-CATGTGGGCCATGAGGTCCACCAC-3’ (G3PDHR). Twenty-four cycles (for G3PDH) or 36 cycles (for CDK5RAP1_v3 and CDK5RAP1_v4) of amplification (30 s at 94°C, 45 s at 67°C and 1.0 min at 72°C) were performed using Taq plus DNA polymerase (Sangon). The PCR products of CDK5RAP1_v3, CDK5RAP1_v4 and G3PDH were then electrophoresed on a 1.5% agarose gel. The products generated from human CDK5RAP1_v3 and CDK5RAP1_v4 were 621 bp and 489 bp, respectively.

RESULTS

The CDK5RAP1_v3 and CDK5RAP1_v4 cDNA sequences

During the large-scale sequencing analysis of a human fetal brain cDNA library constructed in our laboratory, we isolated two novel splice variants of the human CDK5RAP1 gene, named CDK5RAP1_v3 and CDK5RAP1_v4. We submitted the cDNA sequences of CDK5RAP1_v3 and CDK5RAP1_v4 to Genbank under the accession numbers AY462283 and AY462284.

Sequence characterization

The CDK5RAP1_v3 and CDK5RAP1_v4 cDNAs are 1923 bp and 1792 bp in length, respectively (Fig. 1). The BLAST analysis of these two sequences revealed that CDK5RAP1_v4 lacked one exon (exon k) that was present in CDK5RAP1_v3, and as a result these cDNAs encoded different putative proteins (Fig. 2). The deduced proteins were 574 amino acids (CDK5RAP1_v3) and 426 amino acids (CDK5RAP1_v4) in length, and shared the 420 N-terminal amino acids. An upstream in-frame stop codon (tga) was found at position 27–29, and one possible polyadenylation signal (AATAAA) in CDK5RAP1_v3 and two possible polyadenylation signals (AATAAA) in CDK5RAP1_v4 were found after the stop codons utilized by these mRNAs (Fig. 1). In addition, a potential signal peptide was found at residues 1 to 33 of both CDK5RAP1_v3 and CDK5RAP1_v4 (Fig. 1). RPS-BLAST and SMART searching revealed that CDK5RAP1_v3 had an Elp3 domain encompassing residues 248 to 487, a radical_SAM domain (residues 252 to 447), and a TRAM domain (residues 501 to 574). In contrast, CDK5RAP1_v4 had a partial Elp3 domain encompassing residues 248 to 426, and a radical_SAM domain (residues 252 to 424). Moreover, an Elp3 domain and a radical_SAM domain were found in each of CDK5RAP1_v1, CDK5RAP1_v2, AAD34147.1 and AAH01215; a TRAM domain was found in these proteins except AAD34147.1. Alignment of human CDK5RAP1_v3, CDK5RAP1_v4 and other CDK5RAP1 members was performed, and revealed similarities and identities among these proteins (Fig. 3).

Fig. 1 The cDNA and deduced amino acid sequences of CDK5RAP1_v3 and CDK5RAP1_v4. The nucleotides are numbered at the left. The in-frame stop codon (tag) is shown in bold; the asterisks denote stop codons; three possible polyadenylation signals (aataaa or attaaa) are boxed. The potential signal peptide is shown in bold; the conserved Elp3 domain is underlined; the radical_SAM domain is underlined and shaded; and the TRAM domain is shaded.

Fig. 2 Genome structure and alternative splicing of human CDK5RAP1 was predicted by alignment of cDNAs with genomic DNA. P1 and P2 show the positions of primers for RT-PCR. The start codon ATG and stop codon TGA/TAG are marked. The same exon is shown by the same letter in various cDNAs; if the length of the exon is different, “2” is added after the letter to discriminate between the 2 types of exons; exon b2 is the right part of exon b; exon m2 is the left part of exon m. Gray shading of a box indicates a common sequence. The black shading indicates the possible gap of nucleotides 975-1016 of AF152097 cDNA. The accession numbers of these splice variants are CDK5RAP1_v3 (AY462283), CDK5RAP1_v4 (AY462284), CDK5RAP1_v1 (NM_016408), CDK5RAP1_v2 (NM_016082), AF152097 and BC001215.

Fig. 3 Alignment of human CDK5RAP1_v3, CDK5RAP1_v4 and other CDK5RAP1 members. The aligned protein sequences are registered under Genbank accession numbers: CDK5RAP1_v3 (AY462283), CDK5RAP1_v4 (AY462284), CDK5RAP1_v1 (NP_057492), CDK5RAP1_v2 (NP_057166), AAD34147.1 (AAD34147.1 is encoded by AF152097 cDNA) and AAH01215.1 (AAH01215.1 is encoded by BC001215 cDNA). The alignment was performed using GeneDoc program (http://www.psc.edu/biomed/genedoc/). Residues with dark shading are identical amino acids. Gray shading indicates 80–90% similarity and light gray indicates 60–70% similarity.

To determine the locations of the CDK5RAP1_v3 and CDK5RAP1_v4, the cDNA sequences were searched against the database with the BLAST program (http://www.ncbi.nlm.nih.gov/BLAST). The gene was thereby mapped to contig NT_028392.4 from chromosome 20q11.2. Moreover, human CDK5RAP1_v3 was found to consist of 13 exons and CDK5RAP1_v4 to consist of 12 exons spanning about 41.3 kb.

Expression patterns of CDK5RAP1_v3 and CDK5RAP1_v4

In order to investigate the tissue expression patterns of the two novel splice variants, we performed RT-PCR. The sense and antisense primer sequences for CDK5RAP1_v3 and CDK5RAP1_v4 were located in exon i and exon m, respectively (Fig. 2). The RT-PCR results showed that human CDK5RAP1_v3 was widely expressed in human tissues. The expression level of CDK5RAP1_v3 was relatively high in placenta and lung; low levels of expression were detected in heart, brain, liver, skeletal muscle, pancreas, spleen, thymus, small intestine and peripheral blood leukocytes. In contrast, human CDK5RAP1_v4 was mainly expressed in brain, placenta and testis (Fig. 4).

[Gel image: bands for CDK5RAP1_v3, CDK5RAP1_v4 and G3PDH; size markers at 750 bp and 500 bp]

Fig. 4 Expression patterns of CDK5RAP1_v3 and CDK5RAP1_v4 in 16 human tissues. RT-PCR analysis of CDK5RAP1_v3, CDK5RAP1_v4 and G3PDH (as a control) was performed as described in the text.

DISCUSSION

Here we report two novel splice variants of the human CDK5RAP1 gene named CDK5RAP1_v3 and CDK5RAP1_v4. The alternative splicing caused the putative Elp3 domain and TRAM domain of CDK5RAP1_v3 and CDK5RAP1_v4 to be different from each other. Elp3 has been isolated as part of a larger six-subunit complex specifically associated with elongating RNA polymerase II (Winkler et al., 2001). It was shown to have histone acetyltransferase (HAT) activity in vitro, and was shown genetically to interact with histone tails and other chromatin-modifying factors (Wittschieben et al., 2000), suggesting that it enhances transcription by modifying chromatin structure, which might ease progression of the polymerase through nucleosomes (Wittschieben et al., 1999). In vivo, ELP3 gene deletion confers phenotypes such as slow growth adaptation, slow gene activation, and temperature sensitivity (Wittschieben et al., 1999). In addition to HAT activity, Elp3-related proteins might possess a second enzyme activity (histone demethylase activity) (Chinenov, 2002). Furthermore, the Elp3 superfamily includes the MiaB family (Esberg et al., 1999) and coproporphyrinogen III oxidase (Lash et al., 2002). All members of the Elp3 superfamily contain a region similar to the catalytic domain of the radical SAM family (Chinenov, 2002). Radical SAM proteins catalyze diverse reactions, including unusual methylations, isomerization, sulfur insertion, ring formation, anaerobic oxidation and protein radical formation. They function in DNA precursor, vitamin, cofactor, antibiotic and herbicide biosynthesis and in biodegradation pathways (Sofia et al., 2001). Evidence exists that these proteins generate a radical species by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S center (Sofia et al., 2001). The TRAM domain is found in two distinct classes of tRNA-modifying enzymes, namely uridine methylases of the TRM2 family and enzymes of the MiaB family, which are involved in 2-methylthioadenine formation (Anantharaman et al., 2001). This domain is predicted to bind tRNA and deliver the RNA-modifying enzymatic domains to their targets. In addition, the TRAM domain is present in several other proteins associated with the translation machinery and in a family of small, uncharacterized archaeal proteins that are predicted to play a role in the regulation of tRNA modification or translation (Anantharaman et al., 2001). CDK5RAP1_v3 has an intact Elp3 domain, a radical_SAM domain and a TRAM domain. In contrast, as a result of alternative splicing, a partial Elp3 domain and an intact radical_SAM domain were found in CDK5RAP1_v4, but no TRAM domain was present, and this may affect the enzymatic activity and the ability of CDK5RAP1_v4 to bind and modify tRNA.

Moreover, AF152097 cDNA (CGI-05) was assembled by a comparative proteomic approach (Comparative gene identification, CGI) (Chunhung et al., 2000). Through careful bioinformatics analyses, we found that nucleotides 975-1016 of AF152097 cDNA do not exist in human genomic contig NT_028392.4 from chromosome 20 (Fig. 3). However, nucleotides 975-1016 of AF152097 cDNA could be found in the human DNA sequence from clone RP5-1187J4 on chromosome 20q11.1-11.23 (GenBank acc. NO: AL355392.7), so AF152097 cDNA may also be a splice variant of CDK5RAP1, which would mean that genomic contig NT_028392.4 from chromosome 20 is still incomplete. On the other hand, because no human EST data support the notion that this region is expressed and no region homologous to nucleotides 975-1016 of AF152097 cDNA could be found in the mouse DNA sequence from clone RP23-154J12 on chromosome 2 (GenBank acc. NO: AL732601.16), nucleotides 975-1016 of AF152097 cDNA may be a gap as well in contrast to human genomic contig NT_028392.4 from chromosome 20 (Fig. 2).

Although both CDK5RAP1_v3 and CDK5RAP1_v4 were expressed in brain and placenta, the abundance of CDK5RAP1_v3 in brain and placenta was distinctly higher than that of CDK5RAP1_v4. In contrast, in testis CDK5RAP1_v4 could be detected whereas CDK5RAP1_v3 could not. In addition, BLASTN searching in the non-redundant and human EST databases of NCBI also roughly revealed the number of clones obtained for each splice variant, which could be taken to indicate the relative abundance. These relative abundances were 82%, 6%, 6%, 2%, 0%, 2% for CDK5RAP1_v1, CDK5RAP1_v2, CDK5RAP1_v3, CDK5RAP1_v4, AF152097 (assembled by comparative proteomic approach) and BC001215, respectively (data not shown). The different expression patterns of the splice variants and comparison of the domains in putative proteins imply that these variants might play different roles in certain tissues. Further studies will be necessary to define the precise roles of splice variants of CDK5RAP1.

This research was a part of Project 30070383 supported by the National Natural Science Foundation of China.

REFERENCES

Anantharaman, V., Koonin, E. V., and Aravind, L. (2001) TRAM, a predicted RNA-binding domain, common to tRNA uracil methylation and adenine thiolation enzymes. FEMS Microbiol Lett 197, 215–221.
Chinenov, Y. (2002) A second catalytic domain in the Elp3 histone acetyltransferases, a candidate for histone demethylase activity? Trends in Biochem Sci 27, 115–117.
Ching, Y. P., Qi, Z., and Wang, J. H. (2000) Cloning of three novel neuronal Cdk5 activator binding proteins. Gene 242, 285–294.
Chunhung, L., Changyuan, C., Lanyang, C., Chungshyan, L., and Wenchang, L. (2000) Identification of novel human genes evolutionarily conserved in Caenorhabditis elegans by comparative proteomics. Genome Res 10, 703–713.
Esberg, B., Leung, H. C., Tsui, H. C., Bjork, G. R., and Winkler, M. E. (1999) Identification of the miaB gene, involved in methylthiolation of isopentenylated A37 derivatives in the tRNA of Salmonella typhimurium and Escherichia coli. J Bacteriol 181, 7256–7265.
Lash, T. D., Kaprak, T. A., Shen, L., and Jones, M. A. (2002) Metabolism of analogues of coproporphyrinogen-III with modified side chains, implications for binding at the active site of coproporphyrinogen oxidase. Bioorg Med Chem Lett 12, 451–456.
Lee, K. Y., Rosales, J. L., Tang, D., and Wang, J. H. (1996) Interaction of cyclin-dependent kinase 5 (Cdk5) and neuronal Cdk5 activator in bovine brain. J Biol Chem 271, 1538–1543.
Sofia, H. J., Chen, G., Hetzler, B. G., Reyes-Spindola, J. F., and Miller, N. E. (2001) Radical SAM, a novel protein superfamily linking unresolved steps in familiar biosynthetic pathways with radical mechanisms, functional characterization using new analysis and information visualization methods. Nucleic Acids Res 29, 1097–1106.
Winkler, G. S., Petrakis, T. G., Ethelberg, S., Tokunaga, M., Erdjument-Bromage, H., Tempst, P., and Svejstrup, J. Q. (2001) RNA polymerase II elongator holoenzyme is composed of two discrete subcomplexes. J Biol Chem 276, 32743–32749.
Wittschieben, B. O., Otero, G., de Bizemont, T., Fellows, J., Erdjument-Bromage, H., Ohba, R., Li, Y., Allis, C. D., Tempst, P., and Svejstrup, J. Q. (1999) A novel histone acetyltransferase is an integral subunit of elongating RNA polymerase II holoenzyme. Mol Cell 4, 123–128.
Wittschieben, B. O., Fellows, J., Du, W., Stillman, D. J., and Svejstrup, J. Q. (2000) Overlapping roles for the histone acetyltransferase activities of SAGA and Elongator in vivo. EMBO J 19, 3060–3068.
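The oligonucleotide sequences quoted in MATERIAL AND METHODS can be sanity-checked with a few lines of code. The sketch below is our illustration, not part of the authors' workflow. It joins the primer sequences across the line breaks of the printed text, reports length and GC content, and checks that the SfiA and SfiB sites fit the general SfiI recognition pattern GGCCNNNNNGGCC. That pattern is standard restriction-enzyme knowledge and is not given in the text. The dictionary keys are hypothetical labels chosen for this example.

```python
import re

# General SfiI recognition pattern (GGCC, any five bases, GGCC).
SFI_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")

# Sequences as quoted in the Methods, written 5'->3'; the keys are our own labels.
OLIGOS = {
    "SfiA": "GGCCATTATGGCC",
    "SfiB": "GGCCGCCTCGGCC",
    "CDK5RAP1_sense": "GCAGCTGATTCATGAGAGAGATAAC",
    "CDK5RAP1_antisense": "ACCAACACTCATTTCACAGTGCAAG",
    "G3PDHF": "TGAAGGTCGGAGTCAACGGATTTGGT",
    "G3PDHR": "CATGTGGGCCATGAGGTCCACCAC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in OLIGOS.items():
    is_sfi = SFI_SITE.fullmatch(seq) is not None
    print(f"{name:20s} len={len(seq):2d} GC={gc_fraction(seq):.2f} SfiI-like={is_sfi}")
```

Only the SfiA and SfiB entries are expected to match the SfiI pattern; the different five-base spacers are what distinguish the two sites from each other.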