CORE Metadata, citation and similar papers at core.ac.uk

Provided by Elsevier - Publisher Connector

Virology 362 (2007) 1–5 www.elsevier.com/locate/yviro

Rapid Communication Insertion of an HERV(K) LTR in the of NBPF3 is not required for its transcriptional activity ⁎ Karl Vandepoele, Frans van Roy

Department for Molecular Biomedical Research, VIB, B-9052 Ghent, Belgium Department of Molecular Biology, Ghent University, B-9052 Ghent, Belgium Received 7 December 2006; returned to author for revision 8 January 2007; accepted 21 January 2007 Available online 28 March 2007

Abstract

The NBPF are members of a recently described family that has an intricate genomic organization. These genes can be subdivided into two subfamilies based on the presence of an intronic HERV(K) LTR insertion in the 5′ region. A recent report describes a functional implication for this insertion, claiming that only NBPF genes with this insertion are transcriptionally active [Illarionova, A.E., Vinogradova, T.V., Sverdlov, E.D., 2007. Only those genes of the KIAA1245 gene subfamily that contain HERV(K) LTRs in their are transcriptionally active. Virology, 358 (1): 39–47]. Here, we show that an NBPF gene lacking this insertion, NBPF3, is expressed in a variety of tissues. Thus the effect of HERV(K) LTR insertion on NBPF gene expression remains unknown. © 2007 Elsevier Inc. All rights reserved.

Keywords: endogenous retrovirus; LTR; Gene families; Regulation of gene expression; ; Evolution

Introduction estimated that 5.3–8.3% of the human genome consists of these kinds of transposable elements (IHGSC, 2001; Venter et al., The NBPF gene family consists of about 22 genes with high 2001). Insertion of these elements near a gene can affect its sequence similarity. They are located primarily on regions of expression (Buzdin et al., 2006). It was reported that insertion of segmental duplications on 1 and have undergone this LTR is involved in transcriptional activation of NBPF recent partial and complete gene duplications (Vandepoele et al., genes, and that genes lacking this insertion are not expressed 2005). Additionally, several reports have shown that the copy (Illarionova et al., 2007). Here, we show that NBPF3, located on number of these genes is variable in (Redon et al., 2006; chromosome 1p36, lacks this insertion but is nevertheless Tuzun et al., 2005). The encoded by these genes are expressed in a variety of tissues. composed of repetitive elements for which no functional role has been described so far. Recently, two subfamilies were identified Results among the NBPF genes, only one of which has an intronic HERV(K) LTR insertion in the 5′ region (Illarionova et al., In a study of the NBPF genes, Illarionova et al. (2007) 2007). This HERV(K) LTR sequence is derived from the long considered the cDNA clone KIAA1245, a NBPF11 transcript, as terminal repeat of a human endogenous retrovirus of the K a prototypical NBPF transcript. Upon comparing the KIAA1245 family and contains a set of regulatory elements necessary for clone to genomic sequences, these authors identified a number transcription of the retroviral RNA (Sverdlov, 1998). It is of related sequences corresponding to different members of the recently described NBPF family (Vandepoele et al., 2005). An HERV(K) LTR insertion was identified in an intron in the 5′ ⁎ Corresponding author. Department for Molecular Biomedical Research, VIB-Ghent University, Technologiepark 927, B-9052 Zwijnaarde, Belgium. region of a number of these paralogs, and it was claimed that Fax: +32 9 33 13 609. only the NBPF genes with this insertion are transcriptionally E-mail address: [email protected] (F. van Roy). active (Illarionova et al., 2007).

0042-6822/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.virol.2007.01.044 2 Rapid Communication

The genes lacking the LTR had been designated as junctions; intronic sequence are in italics; mutated nucleotides AL592309 and AL356957 (Illarionova et al., 2007), in are underlined). These mutations might interfere with the correspondence to the GenBank accession numbers of the retention of these sequences as in the mature NBPF3 BAC clones encompassing the different genomic fragments. transcript. Based on our previous genomic analysis of the NBPF genes For the NBPF17P gene, although we identified a single (Vandepoele et al., 2005), we annotated these genes as NBPF3 DNA clone as a transcript from this gene (GenBank accession and NBPF17P, respectively. Similarly, NBPF genes containing no. BE818279), we classified it as a pseudogene due to the the LTR insertion had been designated AL049742, AL954711 presence of many nonsense mutations in the coding sequence and AL592284 (Illarionova et al., 2007), and we annotated them (Vandepoele et al., 2005). as NBPF10, NBPF11 and NBPF15.RT–PCR experiments Apart from the intronic insertion of the HERV(K) LTR in the performed with different primer sets led Illarionova et al. (2007) 5′ region, the NBPF3 and NBPF17P genes differ from the other to conclude that only NBPF10, NBPF11 and NBPF15 were NBPF genes in the length of the intron between the coding expressed; the absence of RT–PCR products for NBPF3 and exons of types 7 and 8. For these two genes, the intron is NBPF17P led to the conclusion that these genes were not 3.2 kbp, whereas for other genes (NBPF1, NBPF9-11, expressed. NBPF14-16 and NBPF20) it is approximately 3.9 kbp. As We used the 5′ untranslated region (UTR) of the NBPF11 described earlier, the majority of the NBPF genes have a LINE transcript, as present in the KIAA1245 clone (nucleotides 1– repeat (Long INterspersed Element; L1P2) inserted in this 770 from GenBank accession no. AB033071), in a BLATsearch intron (Vandepoele et al., 2005). The longer intron in the second of the human genome (NCBI build 36.1). Many hits were group is due to the additional insertion of an L1PA4 repeat into obtained, corresponding to the regions previously described as the first LINE repeat. The presence or absence of the HERV(K) harbouring NBPF genes (Vandepoele et al., 2005). The result LTR and the L1PA4 LINE repeat clearly distinguishes the for the NBPF3 gene is shown in Fig. 1, which also shows the NBPF3 and NBPF17P genes from the other NBPF genes and sequence of the NBPF11 transcript, annotated as KIAA1245, shows that these genes are at the evolutionary cradle of the and the regions that are homologous to the genomic of NBPF gene family. NBPF3 (blocks connected by lines). The RepeatMasker track at the bottom of the figure clearly confirms that there is no LTR Discussion sequence in the genomic sequence flanked by the primers used by Illarionova et al. (2007). As expected, the genomic sequence The members of the NBPF gene family are located primarily surrounding the NBPF3 transcription initiation site does contain on segmental duplications of and have been repetitive elements, including LTRs, but we could not detect an subjected to several partial and complete gene duplications in HERV(K) LTR in this region. recent evolution (Vandepoele et al., 2005). This evolution is still In this analysis, we also observed numerous cDNA clones going on, as both copy number variation and structural variation and ESTs annotated to this gene. These results corroborate our between different individuals have been described for some previous expression analysis of the NBPF genes (Vandepoele NBPF genes (Redon et al., 2006; Tuzun et al., 2005). et al., 2005), in which we annotated many cDNAs to the NBPF3 It has been shown that after gene duplication, regulatory gene. The NBPF3 transcripts shown in Fig. 1 were derived from regions can diverge more rapidly than coding sequences, a several tissues, both adult and embryonic, and cell lines phenomenon known as the duplication–degeneration–comple- including testis (17), brain (13) and embryonic stem cells (5). mentation model (Force et al., 1999). Apart from nucleotide This demonstrates that NBPF3 is expressed in different tissues. mutations, the insertion of retroviral elements can also influence Remarkably, none of the NBPF3 transcripts contain the the transcriptional activities of the affected gene (Peaston et al., regions homologous to the second and third NBPF11 exons, as 2004; van de Lagemaat et al., 2003). During the evolution of the present in the KIAA1245 sequence. We compared the genomic NBPF genes, an HERV(K) LTR insertion occurred in the 5′ NBPF3 sequences with the homologous –intron bound- region of one of the duplicated genes. Further duplications gave aries of the NBPF11 gene, and found mutations in the NBPF3 rise to the two recently described NBPF subfamilies (Illarionova donor splice site for both ‘exons’ containing the primer-binding et al., 2007). The limited number of nucleotide substitutions in sites used by Illarionova et al. (2007). For exon 2, containing the different NBPF paralogs impedes a reliable phylogenetic primer-1 binding site, the TTT⁎GTAAGT sequence in the analysis, but the analysis can be significantly improved by NBPF11 gene is mutated to TTT⁎GTAAAT in NBPF3. For exon considering additional data. Here, we show that the NBPF3 and 3, containing the primer-3 and -4 binding sites, the AAG⁎GT- NBPF17P genes, located on 1p36 and 1q21, GAGG sequence in the NBPF11 gene is mutated to AAG⁎TT- respectively, originated quite early during the evolution of the GAGG in NBPF3 (asterisks denote real or putative exon–intron NBPF gene family, as they both lack the insertion of an

Fig. 1. Overview of the NBPF3 locus. The 5′-UTR of KIAA1245 (nucleotides 1–770 of AB033071) was used in a BLAT search against the human genome, in combination with the forward (Pr1) and reverse (Pr3 and Pr4) primers used by Illarionova et al. (2007). The 5′-UTR of KIAA1245 shows homology to the genomic region of NBPF3, illustrated as blocks interconnected by lines, but none of the numerous transcripts from this gene contains the fragments to which the primers bind. The RepeatMasker track at the bottom clearly shows the absence of HERV(K) LTR in the genomic region flanked by Pr1 and Pr3. None of the repeats shown in the NBPF3 locus are HERV(K) LTR sequences. Rapid Communication 3 4 Rapid Communication

HERV(K) LTR in their 5′ region (Illarionova et al., 2007) and in vivo as active promoters for host nonrepetitive DNA transcription. – the insertion of an L1PA4 repeat in the middle of the gene. J. Virol. 80 (21), 10752 10762. Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., Postlethwait, J., Although the LTR sequence was shown to enhance the 1999. Preservation of duplicate genes by complementary, degenerative transcription of the SV40 promoter (Illarionova et al., 2007), mutations. Genetics 151 (4), 1531–1545. we show here that presence of the HERV(K) LTR is not required Hughes, T.A., 2006. Regulation of gene expression by alternative untranslated for expression of the NBPF genes. Our in silico analysis, regions. Trends Genet. 22, 119–122. combined with our previous experimental analysis (Vandepoele llarionova, A.E., Vinogradova, T.V., Sverdlov, E.D., 2007. Only those genes of the KIAA1245 gene subfamily that contain HERV(K) LTRs in their introns et al., 2005), shows that NBPF3 is expressed in a variety of are transcriptionally active. Virology 358 (1), 39–47. tissues, even though it lacks the HERV(K) LTR insertion. International Human Genome Sequencing Consortium (IHGSC), 2001. Initial Additionally, we show that Illarionova et al. (2007) failed to sequencing and analysis of the human genome. Nature 409 (6822), detect transcription due to a flawed choice of primers. The 860–921. regions targeted by their primers are not incorporated in the Peaston, A.E., Evsikov, A.V., Graber, J.H., de Vries, W.N., Holbrook, A.E., Solter, D., Knowles, B.B., 2004. Retrotransposons regulate host genes in mature NBPF3 transcripts, which would inevitably lead to false mouse oocytes and preimplantation embryos. Dev. Cell. 7, 597–606. negative results. Redon, R., Ishikawa, S., Fitch, K.R., Feuk, L., Perry, G.H., Andrews, T.D., In most duplicated genes, one of the genes accumulates Fiegler, H., Shapero, M.H., Carson, A.R., Chen, W., Cho, E.K., Dallaire, deleterious mutations, leading to complete degeneration and S., Freeman, J.L., Gonzalez, J.R., Gratacos, M., Huang, J., Kalaitzopoulos, functional inactivation of the gene. However, in some cases, D., Komura, D., MacDonald, J.R., Marshall, C.R., Mei, R., Montgomery, L., Nishimura, K., Okamura, K., Shen, F., Somerville, M.J., Tchinda, J., subfunctionalization can occur, whereby the function of the Valsesia, A., Woodwark, C., Yang, F., Zhang, J., Zerjal, T., Armengol, L., original gene is divided over the two daughter copies. This Conrad, D.F., Estivill, X., Tyler-Smith, C., Carter, N.P., Aburatani, H., Lee, subfunctionalization can occur either through mutations in the C., Jones, K.W., Scherer, S.W., Hurles, M.E., 2006. Global variation in coding sequence (Zhang et al., 2002) or through differential copy number in the human genome. Nature 444 (7118), 444–454. regulation of transcription or translation. Here, we show that in Sverdlov, E.D., 1998. Perpetually mobile footprints of ancient infections in human genome. FEBS Lett. 428, 1–6. spite of high nucleotide identity between different NBPF Tuzun, E., Sharp, A.J., Bailey, J.A., Kaul, R., Morrison, V.A., Pertz, L.M., paralogs, both in coding and non-coding regions, the 5′-UTR of Haugen, E., Hayden, H., Albertson, D., Pinkel, D., Olson, M.V., Eichler, E. the processed transcript can differ quite extensively through the E., 2005. Fine-scale structural variation of the human genome. Nat. Genet. use of different exons, potentially leading to differential reg- 37, 727–732. ulation (Hughes, 2006). In order to avoid false interpretations, van de Lagemaat, L.N., Landry, J.R., Mager, D.L., Medstrand, P., 2003. Transposable elements in mammals promote regulatory variation and this phenomenon should be taken into account in the expression diversification of genes with specialized functions. Trends Genet. 19, analysis of recently duplicated genes. 530–536. All together, we conclude that the statement by Illarionova Vandepoele, K., Van Roy, N., Staes, K., Speleman, F., van Roy, F., 2005. A et al. (2007) that NBPF genes lacking the HERV(K) LTR novel gene family NBPF: intricate structure generated by gene duplications – insertion are transcriptionally inactive is not correct. However, during primate evolution. Mol. Biol. Evol. 22 (11), 2265 2274. Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., we do acknowledge the differential presence of the HERV(K) Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., Gocayne, J.D., LTR insertion in different NBPF members, which should be Amanatides, P., Ballew, R.M., Huson, D.H., Wortman, J.R., Zhang, Q., analyzed more thoroughly to validate the influence of this Kodira, C.D., Zheng, X.H., Chen, L., Skupski, M., Subramanian, G., Thomas, insertion on the transcriptional activities of these genes. P.D., Zhang, J., Gabor Miklos, G.L., Nelson, C., Broder, S., Clark, A.G., Nadeau, J., McKusick, V.A., Zinder, N., Levine, A.J., Roberts, R.J., Simon, M., Slayman, C., Hunkapiller, M., Bolanos, R., Delcher, A., Dew, I., Fasulo, Materials and methods D., Flanigan, M., Florea, L., Halpern, A., Hannenhalli, S., Kravitz, S., Levy, S., Mobarry, C., Reinert, K., Remington, K., Abu-Threideh, J., Beasley, E., The NBPF3 sequences homologous to the KIAA1245 Biddick, K., Bonazzi, V., Brandon, R., Cargill, M., Chandramouliswaran, I., fragment were isolated using the Human Genome Browser at Charlab, R., Chaturvedi, K., Deng, Z., Di Francesco, V., Dunn, P., Eilbeck, UCSC (http://genome.ucsc.edu/). Repetitive elements were K., Evangelista, C., Gabrielian, A.E., Gan, W., Ge, W., Gong, F., Gu, Z., Guan, P., Heiman, T.J., Higgins, M.E., Ji, R.R., Ke, Z., Ketchum, K.A., Lai, identified with the RepeatMasker software (http://www. Z., Lei, Y., Li, Z., Li, J., Liang, Y., Lin, X., Lu, F., Merkulov, G.V., Milshina, repeatmasker.org/). N., Moore, H.M., Naik, A.K., Narayan, V.A., Neelam, B., Nusskern, D., Rusch, D.B., Salzberg, S., Shao, W., Shue, B., Sun, J., Wang, Z., Wang, A., Acknowledgments Wang, X., Wang, J., Wei, M., Wides, R., Xiao, C., Yan, C., Yao, A., Ye, J., Zhan, M., Zhang, W., Zhang, H., Zhao, Q., Zheng, L., Zhong, F., Zhong, W., Zhu, S., Zhao, S., Gilbert, D., Baumhueter, S., Spier, G., Carter, C., Karl Vandepoele was supported by the Interuniversity Cravchik, A., Woodage, T., Ali, F., An, H., Awe, A., Baldwin, D., Baden, Attraction Poles Programme of the Belgian Science Policy. H., Barnstead, M., Barrow, I., Beeson, K., Busam, D., Carver, A., Center, Present research is supported by the Foundation against A., Cheng, M.L., Curry, L., Danaher, S., Davenport, L., Desilets, R., Dietz, Cancer, Belgium. We thank Dr. Amin Bredan for editing the S., Dodson, K., Doup, L., Ferriera, S., Garg, N., Gluecksmann, A., Hart, B., manuscript. Haynes, J., Haynes, C., Heiner, C., Hladun, S., Hostin, D., Houck, J., Howland, T., Ibegwam, C., Johnson, J., Kalush, F., Kline, L., Koduru, S., Love, A., Mann, F., May, D., McCawley, S., McIntosh, T., McMullen, I., Moy, References M., Moy, L., Murphy, B., Nelson, K., Pfannkoch, C., Pratts, E., Puri, V., Qureshi, H., Reardon, M., Rodriguez, R., Rogers, Y.H., Romblad, D., Ruhfel, Buzdin, A., Kovalskaya-Alexandrova, E., Gogvadze, E., Sverdlov, E., 2006. At B., Scott, R., Sitter, C., Smallwood, M., Stewart, E., Strong, R., Suh, E., least 50% of human-specific HERV-K (HML-2) long terminal repeats serve Thomas, R., Tint, N.N., Tse, S., Vech, C., Wang, G., Wetter, J., Williams, S., Rapid Communication 5

Williams, M., Windsor, S., Winn-Deen, E., Wolfe, K., Zaveri, J., Zaveri, K., Kraft, C., Levitsky, A., Lewis, M., Liu, X., Lopez, J., Ma, D., Majoros, Abril, J.F., Guigo, R., Campbell, M.J., Sjolander, K.V., Karlak, B., W., McDaniel, J., Murphy, S., Newman, M., Nguyen, T., Nguyen, N., Kejariwal, A., Mi, H., Lazareva, B., Hatton, T., Narechania, A., Diemer, Nodell, M., Pan, S., Peck, J., Peterson, M., Rowe, W., Sanders, R., Scott, K., Muruganujan, A., Guo, N., Sato, S., Bafna, V., Istrail, S., Lippert, R., J., Simpson, M., Smith, T., Sprague, A., Stockwell, T., Turner, R., Venter, Schwartz, R., Walenz, B., Yooseph, S., Allen, D., Basu, A., Baxendale, J., E., Wang, M., Wen, M., Wu, D., Wu, M., Xia, A., Zandieh, A., Zhu, X., Blick, L., Caminha, M., Carnes-Stine, J., Caulk, P., Chiang, Y.H., Coyne, 2001. The sequence of the human genome. Science 291 (5507), M., Dahlke, C., Mays, A., Dombroski, M., Donnelly, M., Ely, D., 1304–1351. Esparham, S., Fosler, C., Gire, H., Glanowski, S., Glasser, K., Glodek, A., Zhang, J.Z., Zhang, Y.P., Rosenberg, H.F., 2002. Adaptive evolution of a Gorokhov, M., Graham, K., Gropman, B., Harris, M., Heil, J., Henderson, duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat. Genet. S., Hoover, J., Jennings, D., Jordan, C., Jordan, J., Kasha, J., Kagan, L., 30, 411–415.