Genomic Architecture and Transcriptional Activation Of
Total Page:16
File Type:pdf, Size:1020Kb
Oncogene (1998) 17, 2761 ± 2770 ã 1998 Stockton Press All rights reserved 0950 ± 9232/98 $12.00 http://www.stockton-press.co.uk/onc Genomic architecture and transcriptional activation of the mouse and human tumor susceptibility gene TSG101: Common types of shorter transcripts are true alternative splice variants Kay-Uwe Wagner1, Patricia Dierisseau1, Edmund B Rucker III1, Gertraud W Robinson1 and Lothar Hennighausen*,1 1Laboratory of Genetics and Physiology, National Institute of Diabetes, Digestive, and Kidney Diseases, National Institutes of Health, Building 8, Room 101, Bethesda, Maryland 20892-0822, USA The functional inactivation of the tumor susceptibility both encode predicted proteins of about 43 kDa in size gene tsg101 in mouse NIH3T3 cells leads to cell that show 94% similarity. On the subcellular level, the transformation and the formation of metastatic tumors TSG101 protein is localized mainly in the cytoplasm in nude mice. We cloned, mapped and sequenced the (Zhong et al., 1997) but depending on the stage of the mouse tsg101 gene and further identi®ed a processed cell cycle it can be detected in the nucleus and it can pseudogene that is 98% identical to the tsg101 cDNA. co-localize with the mitotic spindle apparatus during Based on Northern blot analysis, tsg101 is expressed cell division (Xie et al., 1998). Another indication that ubiquitously in mouse tissues. A comparison of the TSG101 may have dierent functions at speci®c stages coding region of the mouse tsg101 gene with the human of the cell cycle is the presence of highly conserved TSG101 cDNA revealed that both the mouse and human motifs, such as a coiled-coil domain that is proposed to gene encode ten additional highly conserved amino acids interact with stathmin (Maucuer et al., 1995) and a at the N-terminus. Based on the mouse tsg101 genomic proline-rich region known to exist in activation structure, we predicted four additional introns within the domains of transcription factors (Li et al., 1997). human TSG101 gene. Their location was con®rmed using Furthermore, an N-terminal region is similar to the PCR and sequencing analysis. The presence of these so catalytic domain of ubiquitin-conjugating enzymes far unidenti®ed introns now explains published data on suggesting a potential role of TSG101 as a regulator aberrantly spliced mRNA products that were frequently in ubiquitin-mediated protein degradation (Koonin and observed in primary breast tumors. We show that a Abagyan, 1997; Ponting et al., 1997). The mechanism majority of shorter TSG101 transcripts are not the result underlying cell transformation after inactivation of of aberrant splicing events, but represent a fraction of both tsg101 alleles is not known. Potentially as a true alternative splice variants. Finally, we examined regulator of protein degradation, TSG101 could tsg101 expression patterns during dierent stages of in¯uence the half-life of other important tumor mammary gland development and in dierent transgenic suppressors and regulatory factors of the cell cycle. mouse models for breast tumorigenesis. Since inactivation of TSG101 was observed to be associated with abnormal microtubule organizing Keywords: TSG101 protein; mice; human; genomic centers and nuclear anomalies, it is suggested that the DNA; pseudogene; mammary gland absence of the TSG101 protein or the lack of individual TSG101 domains may lead to chromosomal instability and, thereby to a progression of tumorigen- esis (Xie et al., 1998). Introduction The human TSG101 gene has been mapped to a speci®c chromosomal region on chromosome 11, The tumor susceptibility gene 101 (tsg101) was discov- p15.1-p15.2 (Li et al., 1997), that is associated with a ered in mouse NIH3T3 ®broblasts using a random loss of heterozygosity in dierent types of tumors such homozygous knockout approach (Li and Cohen, 1996). as breast cancer (Ali et al., 1987) and Wilms' tumor The functional inactivation of this gene leads to cell (Reeve et al., 1989). However, rearrangements and transformation in vitro and to metastasizing tumors in somatic mutations within the TSG101 gene are rare vivo when transformed mouse ®broblasts were trans- events in human breast cancers (Steiner et al., 1997; planted into nude mice. Reversal of the neoplastic Gayther et al., 1997; Lee and Feinberg, 1997; Wang et properties was obtained in vitro after deleting the gene al., 1998). Instead of gene alterations, aberrant TSG101 that transactivated the antisense promoter using Cre-lox splice forms have been correlated to cell transformation recombination. These results suggested that transforma- (Gayther et al., 1997; Sun et al., 1997; Lee and tion and tumorigenesis were a direct consequence of Feinberg, 1997). In order to identify the structural TSG101 de®ciency (Li and Cohen, 1996). basis of the aberrant transcripts, the intron-exon The tsg101 cDNAs in mouse and human are 86% architecture of the TSG101 gene needs to be identical on the nucleotide level (Li et al., 1997). They determined. This information and the identi®cation of regulatory elements within the promoter will provide new insights into TSG101 transcriptional regulation, alternative and aberrant mRNA splicing events, *Correspondence: L Hennighausen Received 19 August 1998; revised 15 October 1998; accepted 26 resulting protein variants, and gene function in normal October 1998 and transformed tissue. Mouse and human TSG101 K-U Wagner et al 2762 Table 1. Evidently, the larger number of exons and Results introns of the murine tsg101 gene compared to the human homologue is the result of four additional introns Genomic organization of the mouse tsg101 gene within a region considered to be the ®rst exon of the The mouse tsg101 gene was cloned as multiple human gene (Li et al., 1997). The ®rst exon includes the 5' overlapping restriction fragments from two individual untranslated sequence and the translation initiation BAC genomic clones. Brie¯y, a cDNA that was codon. With 329 bp, exon 10 is the largest exon, but it generated by RT ± PCR from mammary tissue of a contains almost 75% of its nucleotides (240 bp) as 3' lactating dam was used to probe a 129/SvJ genomic untranslated sequence. A comparison of the coding BAC library (see Material and methods). Six positive region to the previously published mouse cDNA BAC clones were analysed and ®ve contained a novel sequence (Li and Cohen, 1996) revealed several processed tsg101 pseudogene but not the actual tsg101 dierences. An insertion of two G residues between gene. In order to distinguish the true gene from the position 37 and 38 of the published mouse cDNA shifts pseudogene, we designed a PCR assay that ampli®ed the ®rst ATG codon in position 33 in frame with the intron sequences (see Materials and methods). The tsg101 coding region. This results in the expansion of the sixth BAC clone contained the genuine tsg101 gene. tsg101 open reading frame of exactly 30 nucleotides Since this BAC clone lacked the 5' part of the tsg101 encoding ten additional amino acids at the N-terminus gene a second overlapping BAC clone was isolated. (Met-Ala-Val-Ser-Glu-Ser-Gln-Leu-Lys-Lys). The hy- Overlapping subclones were sequenced and the size of pothesis that the ATG at position 33 is the preferred the mouse tsg101 gene was determined to be 33.6 kb translational start site is supported by the fact that this (GenBank AF060868). Furthermore, we identi®ed ATG codon is ¯anked by a C-rich sequence motif in additional introns, which had not been predicted from position 76to71 and a highly conserved G in positon mapping studies (Li et al., 1997). In addition, we +4 (Kozak, 1997). A mismatch at position 81 (T to C) is determined dierences in the coding region of the similar to the human sequence and did not result in any mouse tsg101 gene to the previously published mouse amino acid substitution. Some additional changes of cDNA sequence (Li and Cohen, 1996). These single nucleotides were observed in the 3' untranslated dierences lead to an expansion of the tsg101 reading region immediately following the translational stop frame that encodes ten additional amino acids at the codon. N-terminus, which are highly similar to the predicted A prominent part of the TSG101 protein, an a-helix human TSG101 protein. domain that forms a coiled-coil structure, was The murine tsg101 gene contains ten exons (Figure 1). reported to be identical to CC2 and has been Sequences of intron-exon junctions are summarized in proposed to interact with stathmin (Maucuer et al., Figure 1 Genomic structure of the mouse tsg101 gene. The numbered boxes illustrating exons 1 ± 10 are not in scale. Solid bars represent known encoded protein domains Table 1 Nucleotide sequence of intron-exon boundaries of the mouse tsg101 gene 3' Acceptor Sequence 5' Donor Sequence Intron Exon Size Intron Size Intron Exon No. (bp) Exon Intron No. (kb) CCCTCTGCCT 1 72 GATGTCCAAG gtgagcccgc 1 6.3 tgcttttcag TACAAATACA 2 84 GATTCATATG gtaagtttat 2 2.6 tcccttttag TTTTTAATGA 3 65 CGTTATCCGAG gtaaataata 3 1.7 tctctcttag GTAATATATA 4 163 CTGGAAACAT gtaagtatta 4 1.8 tcctttttag CCACGGTCAG 5 126 CCACCAAATA gtaagtacac 5 3.5 ttatttttag CCTCCTACAT 6 66 CCAACCCCAG gtaatgaaaa 6 2.4 ttttctatag TGGTTATCCT 7 91 ACCACTGTTG gtaagtataa 7 8.6 tctcctgcag GTCCAGCAG 8 202 TCAAGAAGTA gtaagtaagc 8 1.2 ctgttttcag GCTGAAGTTG 9 239 GTTCCTGAAA gtaagtaccc 9 1.2 tccttcctag CACGTCCGCT 10 329 TGTTGCATAA Mouse and human TSG101 K-U Wagner et al 2763 1995). This particular domain is encoded by two of GAPBF2 known to exist in promoters of other the largest coding exons of tsg101, exon 8 and 9 ubiquitously expressed genes (Aki et al., 1997). (Figure 1). The proline rich region, which is known to be characteristic for transcription factors, is entirely The mouse genome carries a tsg101 pseudogene encoded by exons 5 ± 7.