Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

RESEARCH The Characterization and Localization of the Mouse Thymopoietin / Lamina-associated Polypeptide 2 and its Alternatively Spliced Products Raanan Berger, 1'4 Livia Theodor, 1'4 Jacob Shoham, 2 Ellen Gokkel, Frida Brok-Simoni, 1 Karen B. Avraham, 3 Neal G. Copeland, 3 Nancy A. Jenkins, 3 Gideon Rechavi, l's and Amos J. Simon ~

1Institute of Hematology, The Chaim Sheba Medical Center, TeI-Hashomer, affiliated with the Sackler School of Medicine, TeI-Aviv University, Israel; 2Life Sciences Faculty, Bar-Ilan University, Ramat-Gan, Israel; 3Mammalian Genetics Laboratory, ABL-Basic Research Program, National Cancer Institute-Frederick Cancer Research and Development Center, Frederick, Maryland 21 702.

Thymopoietins (Tmpos) are a group of ubiquitously expressed nuclear , with to lamina-associated polypeptide 2 (LAP2). Here we report the isolation and characterization of seven mouse Tmpo mRNA transcripts named Tmpo oL, f3, f3', ~, ~, 8, and i~. The c~, 13, and ~ Tmpo cDNA clones are the mouse homologs of the previously characterized human o~, 13, and ~/TMPOs, respectively, whereas Tmpo e, 8, and ~ are novel cDNAs. Additionally, the mouse Tmpo gene was cloned and characterized, it is a single-copy gene organized in 10 exons spanning -22 kb, which encodes all of the described Tmpo cDNA sequences, located in the central region of mouse 10. The almost identical genomic organization between the human and mouse , and the novel alternatively spliced mouse transcripts, led us to reanalyze the human TMPO gene. The human 13-specific domain was found to be encoded by 3 exons designated 6a, 6b, and 6c and not by a single exon as described previously. These findings suggest that there may be more human transcripts than currently recognized. The possible involvement of the new growing family of Tmpo proteins in nuclear architecture and cell cycle control is discussed.

Thymopoietin (Tmpo) was originally isolated tions, which suggested a tissue-specific expres- from bovine thymic extracts (Goldstein 1974) as sion pattern of various isoforms. a 49-amino-acid polypeptide (Schlesinger and Isolation of a bovine thymopoietin cDNA Goldstein 1975). The immunomodulating effects (Zevin-Sonkin et al. 1992) and subsequently hu- attributed to Tmpo and its putative active do- man cDNAs, encoding three related but distinct main thymopentin (amino acids 32-36, Arg-Lys- TMPOs (TMPO c~, 13, and ~/) (Harris et al. 1994), Asp-Val-Tyr) led to clinical trials using thymo- expanded our knowledge about Tmpo. Human pentin as a drug in several diseases such as rheu- TMPO c~, ~, and ~/ share an identical amino- matoid arthritis (Kantharia et al. 1989) and terminal domain of 187 amino acids, which is human immunodeficiency virus infection (Co- followed in TMPO~ by a specific domain (506 nant et al. 1992). Characterization of the amino amino acids). TMPOs ~ and ~/are closely related acid sequence of the polypeptide from a variety structurally, with TMPO~ differing from TMPO~/ of tissues (Audhya et al. 1981; Audhya and Gold- only by the insertion of a 13-specific domain of stein 1988) revealed several amino acid substitu- 109 amino acids after amino acid 220. Recently, a single human TMPO gene was isolated and characterized (Harris et al. 1995). The gene spans over -35 kb of genomic DNA, 4These authors contributed equally to this work. containing 8 exons that encode the three TMPO- SCorresponding author. E-MAIL [email protected]; FAX 972-3-530- spliced mRNAs. TMPO~ was found to be the hu- 3506. man homolog of the rat lamina-associated poly-

-370 ©1996 by Cold Spring Harbor Laboratory Press ISSN 1054-9803/96 $5.00 GENOME RESEARCH@ 361 Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

BERGER ET AL. 2 (LAP2) (Furukawa et al. 1995; Harris et contain a possible hydrophobic transmembrane al. 1995), an integral of the inner nuclear domain near their carboxyl termini (amino acids membrane. 409-432 of Tmpof3), suggesting a possible associa- In this study we report the isolation and mo- tion of these proteins with cellular membranes. lecular characterization of seven distinct mouse The sequence S/T-P-X-X is found seven times in Tmpo (locus designation) cDNAs that encode for the Tmpo[3 sequence. Two of these, T-P-R-K at six putative mouse Tmpo proteins. In addition, residues 255-258 and T-P-K-K at residues 319- the genomic structure and chromosomal local- 322, are especially likely to be cdc2 kinase sub- ization of the mouse Tmpo gene is elucidated. strates because of the basic residues in the X po- The differences in genomic organization between sitions (Nigg 1993). Tmpo~ is identical to Tmpo [3, the human and mouse genes, and the novel al- e, 8, and y in its amino-terminal domain through ternatively spliced mouse transcripts, led us to Gln 219. Its ORF, however, stops 5 amino acids reanalyze the human TMPO gene. downstream of this point followed by a distinct 1619-bp 3' UTR domain (Fig. 1C). RESULTS Figure 2 demonstrates a schematic presenta- tion of the different mouse cDNA clones. Mouse Analysis of the Mouse rmpo oL, f3, f3', ~, ~, ~, and Tmpo cx, [3, 8, e, % and ~ share an identical amino- Sequences terminal 186-amino-acid domain. Like the hu- The mouse Tmpo~ cDNA clone was isolated from man and the bovine Tmpo isoforms, amino acids cDNA library and characterized, using a 1-49 are highly homologous to the originally pu- 126-bp fragment, encoding Tmpo amino acids rified 49-amino-acid bovine Tmpo (Schlesinger 1-42, from the bovine clone cDNA 113 (Zevin- and Goldstein 1975). After Glu 186, Tmpocx di- Sonkin et al. 1992), as a probe. The same library verges from the other Tmpo. Tmpoe differs from was subsequently screened with a probe derived Tmpo[3 only because it lacks the [3-specific resi- from the amino-terminal 790 bp of the Tmpo¢ dues 220-259. Tmpo8 lacks both the [3 and the cDNA. One hundred fifty-five positive clones ~/[3-specific domains contained within residues were obtained. Repetitive screenings and restric- 220-291 of Tmpo[3. Tmpo'y is missing the [3, ~/[3, tion enzyme analysis revealed at least seven dis- and 8/~/[3-specific domains contained within tinct Tmpo transcripts. A representative clone amino acids 220-328 of Tmpo{3. RT-PCR analysis from each of them was chosen for further se- of mouse thymus total RNA confirmed the pres- quence analysis. The nucleotide and predicted ence of the novel Tmpo mRNA transcripts (not amino acid sequences of mouse Tmpo oL, [3, f3', ~, shown). Mouse Tmpo cx, [3, and y ORF sequences 8, ~/, and ¢ are shown in Figure 1. Examination of are 78%, 90%, and 91% identical to the previ- mouse Tmpocx sequence (Fig. 1A) reveals a short ously published human TMPO u, [3, and y se- region of basic amino acids (amino acids 188- quences (Harris et al. 1994), respectively, suggest- 194) suggestive of a nuclear localization domain ing interspecies conserved functions. (Kalderon et al. 1984), and a possible tyrosine phosphorylation site (amino acids 618-625) Genomic Organization of the TmpoGene (Patschinsky et al. 1982). The sequence S/T-P-X- Overlapping clones covering the entire Tmpo X, a potential recognition sequence for cdc2- gene were isolated from a BALB/c X genomic related kinases (Nigg 1993) is found 10 times library (Fig. 3). Restriction mapping and partial throughout the TmpooL sequence. Figure 1B sequencing of the clones suggested that Tmpo oL, presents the [3, [3', e, 8, and ~/ Tmpo sequences. [3, ~, 8, ~/, and ~ are produced via alternative The Tmpo [3 and [3' clones are identical in their mRNA splicing from a single gene. DNA sequence open reading flame (ORF) sequence but differ in of the relevant regions was obtained by sequenc- an additional 3'-untranslated region (3' UTR) se- ing of genomic subclones and by using internal quence starting in A 1715 of the [3' clone, probably primers. This made it possible to define the pre- because of an alternative polyadenylation signal. cise location of all the exons, the sequences of all Hydropathy analysis revealed that similar to the the intron-exon junctions, and the 5'-flanking human isoforms, all of the mouse Tmpo clones region of the Tmpo gene. Figure 3 schematically lack an amino-terminal hydrophobic signal se- presents the organization of the Tmpo gene and quence typical of secreted polypeptides (not the overlapping genomic clones used for map- shown). However, this analysis revealed that like ping and sequencing. The gene contains 10 ex- the human TMPO [3 and % Tmpo [3, e, 8, and ~/ ons spanning -22 kb genomic DNA.

362 @ GENOME RESEARCH Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

MOLECULAR CHARACTERIZATION OF MOUSE THYMOPOIETIN GENE

Thymopoi~liP a B Thvmopoietinz B. 8'. E. 8 and 7 C Thvmoooietin

GGC TGCAC T TGC TCC C CAGTc GTC,C GC CAGGGGC TTT TTGTGG CT~GGGCTCC~TTC~T~ACTT~TCCCC~TCGT~CAC~TTTTTGT~ CTCCCCAGTCGTGCGCCAC~TTTTTGTGG C GG~CGGAC C T GC~TT TTGTGTC C0 GA~TTCTGTC-CC GTGC C C GC C,~C GCC C,CTC C G CGC~CC, GACCT~TTTTGTGTCCC~CTGT~CGT~CC~C~TCCG CC,C~C GGAC CT GGGT TTTGT GTCCC GAGT TCTGTGCCGTGCCCGCC~,C C-CC C,CTC CG C,C~AC~GGAT C TC C C C,AC, C,C GGC GGGCC-CAC, C C C GGGAC C AGC GAC,CC~ C,CGC C GGC G C,C,C,CAGGGAT C TC C C GAGGC ~ C4?,C,C~ C C GGGAC ~ GAC,CGAC, C ~ ~ C~ G GGGCAGC~TC TC C C GAGGC GC,C GGGC GCAGCC CGGGACCAGCGAC-CGAC~ GC GC CGGC G T GAGCAGC GGC GGC GGC GAC T GT GAGGGGC C GG~GGAGGGGGAC GAGATG TC.AGC~ ~ ~ ~ ~CTGTC.AG~ C .CGAGATG T GAGCAGC GGC GGC GGC GAC T GTC.AGC-GGC C ~GAC.ATG M S M

C C GGAGTT C C TAGAGGAC C C T T C GGT C C T GAC C AAAGACAAGTTGAAGAC, C GAGT TC,GTC 60 CCGC~TTCCTAC~CCCTTC~TOCT~CC~GTTG~TT~TC 60 C C C,C,AGTT CC TAGAGGAC C C T T C GGT C C TGAC CAAAC,ACAAGTTC.A~ GAiT TGGTC 60 P E F L E D P S V L T M D K L M S E L V 20PEFLEDPSVLTKDKLKSZLV 20 P Z F L Z D P S V L T K D K L K S 8 L v 20

GC CAACAAC GTGAC GC T C C C GGC C GGC GAGC AGC GC AAGGAC GTGTAC GTC, CAGC TC TAC 120~C~C~CGTGAC~TCCC~C~~GC, ACGTGTACGT~TCTAC 120 GC CAACAAC GTGAC GC TC C C GGC C C,GCGAGCAGC GCAAGGAC GTGTAC GTC,CAC, CTC TAC 120 A ~ N V T L V .~ C, Z 0 It X D v "/ V O L Y 40AMNVTLPAGEQRKDVYVQLY 40 A N N V T L P A G E Q R I( D V y V Q L Y 40

c TGCAGCACC TCACGGCGCGCAACCC-C, CCGCCC,CTCGCCGCC~CAACAC, CAAAC~ 180CT~A~ACCTCAC~CCG~C~C~TC~C~GGGAGCC~A~ 180 C T GC AGCAC C TCAC GGC GC GCAAC C GGC CGC C GC TCGCCGC GGGAGC CAACAC~A~ 180 L O H L T A a N ~ ~ V L A A G A N S K S 60LQHLTARNRPPLAAGANSKG 60 L Q H L T A R N R P P L A A G A N S K G 60

C C GC C C GAC T T C T C GAGC GAC GAGGAGC GC GAC C,CC AC GCC GGTGC TC GGC T C C GGGGC C 240CC~CC~CTTCTC~C~A~C~C~C~T~TC~TCCC.~C 240 C C GC C C GAC T TC TC GAGC GAC GAGGAGC GCGAC GC C AC GC C GGTC-C T C GGCTC C GC~C 240 P P D F S S D E E R D A T P V L G S G A 80PPOFSSDEERDATPVLGSGA 80 P P D F S S D E E R D A T P V L G S G A 80

T C C GTGGGTC GC GGC C GC GGC C-CC GTC GGCAGGAAAGCCAC AAAGAAAAC TGATAAGC C C 300 TCCGT~TC~C~CGTC~AGGAAAGCCACAAAGAAAACT~TAAGCCC 300 T C C GTGGGT C GC C,C,CC C,CGGC GC C GTC GGCAC~ CACAAAC.A~C TGATA~C C 300 5 V G ~ G a ~ A V G ~ K A T X K T I) X P 1005VGRGRGAVGRKATKKTDKP 100 S V G R G R G A V G R K A T K K T D K P 100

AGGC TAGAAGATAA~T GATC TGGAT GT GACAGAGC TC T C TAATGAAGAAC T TC T GGAT 360AGGCTAG~GATAAAGATGATCT~TGTGACA~TCTCT~T~CTTCT~T 360 AC-GC TAC~TAAAGATGATC TGGAT GT GACAGAGC TC TC TAATGAAC, AAC T TC TGGAT 360 R L E D K D D L D V T E L S N E E L L D 120RLEDKDDLDVTELSNEELLD 120 R L E D K D D L 0 V T E L S N E E L L D 120

CAGC TT GTAAC,ATAT GGAGTGAAT C C TGGTC C CATT GT GGGAAC AAC CAGGAAGC TATAT 420 C~TTGTAAGATAT~AGT~TCCT~TCCCATTGTGGGAAC~CCA~TATAT 420 CAGC T TGTAAGATAT GGAGTGAATC C T GGT C C CAT TGT GGGAACAAC CAC-~C,C TATAT 420 Q L V R Y G V N P G P I V G T T R K L Y 140QLVRYGVNPGPIVGTTRKLY 140 Q L V r Y G v N P G P I v G T T r K L Y 140

C,AGAAGAAGC TGT T C~C,C TGAGGGAGCAGGGAAC T GAAT C GAGATC C T C TAC TC C TC T T 4800,AGA~TGTTG~T~A~AC~CT~TC~TCCTCTACTCC~TT 480 GAGAAGAAGC T GTTGAAGC T GAGGGAGCAGGGAAC TGAATCGAGAT C C TC TAC TC C T C T T 480 K K L L K I. r z Q G T ~ S I~ S S T ~ L 160KKKLLKLREQGTESRS'STPL 160 E K K L L K L R E O G T E S R S 8 T P L 160

c CAAC AGT C T C T TC C T C T GCAGAAAACACAAGC.C AGAATGGAAGTAAC GAC TC T GACAGA 540CC~CAGTCTCTTCCTCT~AGAA~CAC~A~TGGAAGT~C~CTCT~A~ 540 C CAACAGT C T C TT C C T C T GC AGAAAACACAAGGCAGAAT GGAAGT AAC GAC T C TGACAGA 540 T V S S S A ~ N T r 0 N G S N D S D r 180PTVSSSAENTRQNGSNDSDR 180 p T V S S S A Z s T R 0 ~ G S S D S D r 180 & TACAGC GAC AAT GAT GAAGGAAAGAAGAAAGAAC ACAAC,AAAGT GAAGTCC GC TAGGGAT 600TACA~C~T~T~G~CTCTAAAAT~T~~GGGAGCC~TA 600 TAC AGC GACAAT GAT GAAGAC TC TAAAATAGAGC T GAAGC T T GAGAAGAGC, GAGC C GC TA 600 Y 5 D N D E G K K K E H K K V K S A R D 200YSDNDEDSKIELKLEKREpL 200 Y S D N D • D S X I E L K L Z K. ~ Z P L 200

TGTGTTCC TTTTTCTGAACTTGCATCTACTCCCTC TGGTGCATTTTTTCAGGGTATTTC T 660~GGGCA~C~CAGTCACACTGAAGCAGAGAA~CT~C~TCAG~ 660 AAGGGC AGAGC AAAGAC C,CCAGT CACAC T GAAGC AGAGAAGGAC TGAGCACAATCAC4GTA 660 C V P F S Z L A S T V S G A F V Q G Z S 220KGRAKT~VTLKQ~RTEHNQS 220 K G R A K T P V T L K O R R T Z H N Q V 220

TTC C CTGAAATC TCCAC C CGTC C TC CTTTGGGCAGGACTGAACTGCAGGCAGC TAAGAAA 720TATTCTC~T~GT~CT~CT~T~C~T~TCTTC~CAF~CCT 720 TTTGTAGTTTTATGATCCCCGTGTC CAGGTGTGTCATCTGAC TAGTATCATCCACCCCAT 720 F P E I S T R P P L G R T E L Q A A K K 240YSQAGVTETEWTSGSSTGGP 240 F V V L * 224

GTACAAAC CACTAAGAGAGAC C CACC TAGGGAGACTTGTAC TGATACGGC CTTGCC GGGA 78~T~A~ATT~CTAGGGAGTCCACC, AGA~TCGAGAA~CTCCAAGGAAAAGG~G 780 v 0 T T K ;R D ~ ~ I~ z T c T D T A L ~ G 260LQALTRESTRGSRRTPRKRV 260 GTC T TGAC C TT T GGT GCAOAC AC GC C GT GCAGC C T G,CTC AC C C CAC T GAC T GGAT TCAT G 780 GTT GAGTGGT GTGGGGCACAT ACT TTGT GAGATGTGT OATC AGCATAGC C TC C GTTGT TT 840 AGGGGCAGACC GCACAAGTTAGC CCC TGGAC GGAGTTTATTTATTC CTTCAGAGC TATGC 840GAAACTTCACA~ATTTTCGTAT~T~T~AGT~TTT~GTACTCCCATA~T 840 ATT GGAT TTAAGTAGT C TAGCAT TAGTCAT GC TGCC TGTTTC TGATT T GGGC T CAGCAGT 900 r G r ~ ~ K L A ~ G R S L F 1 ~ S z L C 280ETSQHFRIDGA~ISESTPIA 280 TTTGAACCTGGGGTCTCC~TATC-GCAGGCCAGCCTTCCACOCTCCTGCTTCTTCCTCTT 960 GAGAT GATTGC T TTAGAAGATAAAAT C,CAC C C TTT TTGATGCAAC T T T C.AAAGC T C GGGA 1020 TATGATAGATGTGTAGAGAAAAGTTC TTCACCATCTTC TCAC, C GTC,AATTTGC TGCCAGG 900 GAAACTAT~TTCAAGC~CG~TCCTT~T~C~TA~TT~TGGA~TTTC 900 AAGTTTCATTTGTGTCTAGATTTCTCA~TTGAATTTCCTTTATTTGT~_,CTTCTTAACAC 1080 Y 0 R C V E K 5 S 5 P S S Q R E F A A R 300 ETIKASSNESLVANRLTGNF 300 TCTC,AAC-CTAGCTAGTATGGTACATTATTTAAAATGACAATTTCCTCTTTTGGGGGGATG 1140 GGACAGGAAGGC-CAAGGTAGAGTTTT CAAAAGA~C T C~TGAGCATAC, GCA 1200 TTGGTCT CT GCT GCAGC T TC TC CTTCACTGATAAGAGAAACCACTACTAC TTAC T CTAAG 960 ~AT~ATCTTCTATTCT~C~TCACT~TTCTCA~T~CCAGA~CACCA 960 TCAATAAAGACCTTCTTTTACAGAG~TGACTGGCTCTGATTTAGCCTCCACACCCT 1260 L V S A A A 5 P 5 L IIE E T T T T Y S K 320 KHASSILP&ITEFSDITRRTP 320 AGGTTCTTTTCCTCATCTTTACTCGTTAGCTTGAAAATGTGTCAGTTTGTTGTTCATCTA 1320 TCTTTTCCATGTTCTGTCTGCCTCTGGTTTCTGTGGTGTTTTGA~TACTTCACAAT 1380 GAACATAGTGAAAATATTT GC CGTGGAGGGAAGAGTAGAGCTCAGC CATTGC GTGCTGAG 1020 ~G~CCATT~C~T~GTGGGAGAAAAAACAGAGGA~GTA~CA~ 1020 GAAGCCCTC, GCCTCCCTGACCAGGCTGGTCTCTAACTTGTAF~AGCTTTCTGCTTCCTGAG 1440 E H 5 E N I C R G G K S R A Q P L R A E 340 KK PLTRAEVGEKTEERRVDR 340 AGATGGGATTACAGAGATGGGATCCCAGTTGTGCACCGACAGTCCTGCTTCCTTCCTTCC 1500 GTGTGATC GGAGGGGAGAGGCAGC, GTC C TGT TC TC C TGACC C TT TACAGGAC TACAGGAG 1560 GAGC C TGGC GT GTCAGATCAGT CAGTTTTCTC TAGTGAAAGAGAAGTACTGCAGGAGTCG 1080 GATATTCTTA~TGTTCCCCTACG~CTCCACTCC~GG~TCAGT~TA~ 1080 TGTACCACCAATCCTGGCCA~TGGTTTACTC, CATAATT~u~GTTGAGTCTC~CAAT 1620 E P O V S D Q 8 V F S S E R E V 5 Q E $ 360 DILKEMFPYEASTPTGISAS 360 ACGTCTATACTAATTCCCACCGTGGCTGTTTCTCCGTCCTCCGTTTGGCC, ACCGTAGTTA 1680 AC CAT GT C GGTAT TT C TGC TTGC TGCAGGT C,GCAGGTGGT TTT CAAGGGT TTTATCTGTT 1740 GAGAGGTC GCAGGTGATTTCTC CACCAC TTGC TCAGGC~.ATTCGAGATTATGTCAATTC T 1140 T~C~AGACC~TC~T~T~C~C~TC~TCAGT~CTTCAGGATG 1140 TCTTGATACCTGTGTAACTGGCATACTGG~TAGACCTGCATGTGGCAGTGCAGAGACT 1800 E R $ Q v I $ P P L A Q A 1 R D Y V N S 380 CRRPIKGAAGRPLELSDFRM 380 TGGGCCTTTTTTCCTTTTGTAGTCACAATAAAAATGTTTGTCAGTTCTGG~TAGTCTCG 1860 TT TTAAC T GT TAGAGT TAGAC TAT GGACTC C TATCTATTTAAATAGAAC TATGC~ TGG 1920 CTGTTAGTCCAGGGTGGG&TC GGCAGTTTGC CTCGGAGTTC TGATTCTGTGCCCACAC TG 1200 ~GTCGTTCTCATCT~GTACGTCCC~GTAT~TCCCTT~A~TGTCAAGTCA 1200 TGACTACTGTATTGGACAGCTTAGCCCTAAAGGGTTTGTCTGGGTTTGGGTTCAGTTTCT 1980 L L V Q G G V G S L P R S S D S V P T L 400 EESFSSKyVPKYAPLADVKS 400 TTGAAAAGAATACTTCATAAATGATGTTGTAGCCTTCCTACTGATCATTTCAF~TAT 2060 GAATTT C C TTCAAAAC TATYAAAATTAATTAATAGTTCAGC TGT TGGTC TT TTCAGTTAA 2100 GATGTAGAAAACATATGTAAGAGACTTAGCCAGTCCAGC TAT CAAGA~TC TGAATC CCTG 1260 GA~GAAGA~C~TCCGTTCCCATGT~TAAAAATGTT~TGTTT~CCTT 1260 GGATTTTAACAGCTGTGAATAATACTGCCTAAACCCGTGACTTTATTATAGTCACAAAGT 2160 D V E N I C K R L S Q S S Y Q D S E S L 420 EKTKKRRSVPMWIKMLLFAL 420 GATGACTTCTCACCATGTGATTTGATATCTGTGACTGATTACCTCATGTTCCTGGGTAAG 2220 AT C C T TAGC T GC C TTAAAGT TAAGT T CAAATAAAACAGC TAC~C TC~TTTGAGA 2280 TCCCC TCC C CGAAAAGTCC CAAGACTCAGTGAGAAGCCAGCAAC~CGGGGACTCAGGC 1320 ~T~GTGTTTTTTGTTTTT~TCTATC~TAT~AAACC~CCAAGG~TCCCTTC 1320 TGCGGTTTCTCTGT 2294 S P P R K v p R L S E K P A ]% G G D S G 440 GACFLFLVy~A...._.MM ETNQGNpF 440 TCCTGTGTGGCATTTCAGAATACACC TGGATCCGAACATAGGTCTTCTTTTGCCAAAAGT 1380 ACT~TTTTCTTC~GATACTAAAATATCC~CTGA~TCATC~CACATTC~CTT c v A F Q N T P G S E H R s s F A K S 460 1380 TNFLQDTKISN, 451 GTTGTC TCTCATTCTCTGACTACTTTAGGAGTAGAAGTGTC TAAGCCACCACCACAC, CAT 1440 V V S H S L T T L G V E v S K P p P Q H 480 GATCTCCTGTTTTTAATAACTGTAGAAAAACATTTGTGTCCACTTGTTGACTGAAC~CT 1440 AAATTGTGATTTCACCTCAGT~%AGTTAGTC, CTC,CATTC~CAG~TGCTTAC 1500 G~TAAAATAGAAC-CC TCAGAGCC GTC TTTTC CTCTCCACGAATC TATCCTAAAAGTGGTG 1500 ACGGATCTCATTTCAATATTTTGGACCTTGC, AC-ATCACATTGTGCCATATGAATAATTTT 1560 K I E A S E P S F P L H E S I L K V V 500 TTTAGCTCCA~AACTTTTTTGGTAGGCTTTATTTTTTTAATGTC, C~CATCTTATTTCACT 1620 TTTC,GGGAAAATGCATTGTTTTGAGTATTTC.A~TAAAGGCAAAACATGGTC, AGTAAT 1680 GAAGAGGAGTGGCAGCAAATTGATAGGCAGCTGCC C TCGGTGGCGTGCAGATATC CAGTT 1560 GTGAAGCTACACATTAAATACTTGG~TTCTTACC,~TTTATGACTTATTCTCTG 1740 E ~ E W Q 0 I D I~ 0 L ~ S V A C r Y ~ v 520 CTGAGTAAAAATGTTACAAATGTGAATGAGGTTCUTATG~CC~_,CCACC~TT~TG 1800 T C C T C CAT C GAGGCAC-C AC GGATAT TATC AGTAC CAAAAG T GGATGAT GAAATC C T G~ 1620 CTTCCTGTACGCAGCGTTTGTTCATGGAAGGTACAGCAATAAGCTCT~GTGTGACTCCTU 1860 S S I K A A R I L S V P K V D D E I L G 540 TCATGTGGTAGTGCTGGTGCC, C-CCTTACCATACCACTCGCTAGAGTATCTGTCATATCAT 1920 GGAGGAATTC ]930 TTTATTTCTGAAOC CACCCCAC GAGCAGCTA0 TCAGGCGTCC TCCAC TGAGTCTTGTGAT 1680 F 1 S E A T P R A A T Q A S S T E S C D 560

AAACATTTGGAC TTAGC TC TC TGTAGATCTTACGAAGCTGCAGCATCAGCATTC, CAGATT 1740 K H L D L A L C R S Y E A A A S' A L Q I 580

GCAGCC CACACGC-C C TTCGTAGC TAAGTCTCTGCAAGCTGACATAAGTCAC, GCTGCACAG 1800 a A H T A ~ V A ~< S L 0 A D I S Q A A Q 600

ATTATTAATTCGATACCTAGTGAT GCACAGCAGGCACTCAGGATTC TGAACAGAACC TAT 1860 I I N S I P $ D A Q Q A L R I L N R T ¥ 620

~ATC-CAGC TTCATATC TTTGTGATC, C TGCATTTGATGAAGTC, AGC.ATGTC TGCC TGTGCC 1920 D A A S Y L C D A A r D Z V R M S A C A 640

ATGGGGTC T TC TACCATGGGTCGGCGTTATCTGTGGTTGAAGGATTGTAAGATTAGTCCA 1980 M G S S T M G R R Y L W L K D C K I S P 660

GC T TC TAAGAATAAATTGACTGTTGC CC CC TTTAAAGGTGGAACATTATTTGGAGGGGAA 2040 a S K N K L T V A ~ F K G G T L F G G E 6B0

GTAC AC AAAGT T AT T A~GC G T GGAAATAAGC AGTAAT GAGGT TAAGAACAAAGGC AG 2100 V H K V I K K I~ G N K Q * 69~

AAAC T TTGACTTAGAGAATTTGTGAAC CT T TC TAAGAC TTTAGTC T CC TAATAGAGC TAA 2160 TTTGGAATTC 2170 Figure I Nucleotide and predicted amino acid sequences of mouse Tmpoc~(A), Tmpo 13, 13', ~, 8, and 7 (B), and Tmpo~ (C). The sequences are numbered so that amino acid +1 is the amino-terminal proline of mature Tmpo and nucleotide +1 is the first nucleotide of the proline codon. The predicted amino acid (single-letter code) is depicted under the middle nucleotide of the corresponding codon. (A) The arrow (G 187) indicates the beinning of the unique c~-domain. The two underlined sequences are a short region of basic amino acids (residues 188-194) suggestive of a nuclear localization domain and a consensus sequence for tyrosine phosphorylation. (B) Nucleotide and amino acid numbers are for Tmpo13. Arrows above amino acids D 187, S22°, V 26°, V 292, and V 329 indicate the beginning of f3/e/~/'y/~, 13, f3/~, 13/~/~, and 13/~/~/7 domains, respectively. The underlined sequences are the cdc2 kinase consensus sequences (residues 255-258 and 319-322), a hydrophobic domain in Tmpo ~, e, ~, and 7 (residues 409-432), and one AATAAA (nucleotides 1655-1660) polyadenylation site. The sequence downstream to G 1716 is unique to Tmpo13'. (C) The two possible polyadenylation sites are underlined. Nucleotide sequences for the mouse Tmpo mRNA have been deposited with the GenBank data base under accession nos. U39073 (Tmpoi~), U39074 (Tmpol3), U39075 (Tmpoe), U39076 (Tmpo~), U39077 (TmpoT), and U39078 (Tmpoo0.

GENOME RESEARCH @ 363 Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

BERGER ET AL.

50 aa encoding amino acids 292-328 of I I 1 186 692 Tmpof3. Exons 6, 7, and 8 are rela- mTPo~ I tively close together (Fig. 3), encod- 1 2 3 4 ing amino acids 220-328, which are homologous to the human 1 186 219 259 291 328 451 13-specific domain (residues 221- 329) encoded by exon 6 (Harris et I 2 3 5 6 7 8 9 10 al. 1995). Exons 9 and 10 are ~//B/~/ ~-specific domains. Exon 9 encodes 1 186 219 251 288 411 amino acids 329-358 of Tmpof3, I l- ' and exon 10 encodes amino acids 1 2 3 5 7 8 9 10 359-451 of Tmpof3 and further en- codes the 3' UTR of Tmpo 13, 13', e, 8, 186 219 256 379 and ~. mTP8 1 2 3 5 8 9 10 Exon 6 of the Human TMPO Gene

1 186 219 342 Contains Two Small lntrons mTPy [ The difference between mouse and 1 2 3 5 9 10 human Tmpo gene organization regarding the g-specific domain 1 186 219 224 (residues 220-328), coupled with the lack of the entire human exon 1 2 3 5 15' i 6 sequence in the GenBank data Figure 2 Schematic diagram of the various mouse Tmpo transcripts. base (accession nos. U18269 and The numbers above the bars indicate amino acid position, and the num- U18271), led us to characterize this bers below indicate the coding exons. The open bar (1-186) depicts the region in the human TMPO gene. identical amino-terminal domain. The hatched bar (186-692) in TmpooL Human genomic DNA was ampli- is the unique s-domain. The solid bar (189-219) is the 13, ~, 8, % and i~ fied by PCR (see Methods). A -750- domain. The lightly stippled bar (219-259) is the 13-specific domain. The bp PCR fragment was cloned and hatched bar (259-291 of Tmpo[3) is the ~/13-specific domain. The verti- characterized. Figure 4 presents the cally striped bar (291-328 of is the 8/e/13-specific domain, and Tmpof3) genomic sequence of the human the heavily stippled bar (328-451 of Tmpo13)is the ~//8/~/13-specific do- main. The arrow indicates the last 5 amino acids of Tmpo~. TMPO exon 6 region. This region contains two intronic sequences, dividing exon 6 into three smaller exons, termed 6a, 6b, and 6c. These Analysis of sequences of exon borders re- exons are organized in the same pattern as exons vealed that exon 1 encodes the 5'-untranslated 6, 7, and 8 of the mouse Tmpo, respectively. region and amino acids 1-91, exon 2 encodes These data strongly suggest that like the mouse amino acids 92-133, and exon 3 encodes amino Tmpo gene, the human TMPO gene contains 10 acids 134-186. These exons are spliced to form exons. the common sequences present in all seven mouse Tmpo genes. The s-specific domain (resi- dues 187-692) and its 3' UTR are encoded by the Sequence Analysis of Exon!Intron Borders large exon 4. Exon 5 encodes amino acids 187- 219 of Tmpo [3, ~, 8, ~, and ~. Exon 5' is the 3' UTR All splice sites for the distinct mouse Tmpo mR- of Tmpo~ and extends downstream from exon 5 NAs contain the canonical GT and AG dinucleo- without an intron between them. Hence, the tides at the intron borders (Table 1). The splice 5-5' border region is functioning as a donor- sites match consensus splice site sequences to splicing site for the alternatively spliced Tmpo f3, varying extents. The mouse and the human 3'- e, 8, and % Exon 6 is the g-specific domain and splice sites share a significant homology (Table encodes amino acids 220-259 of Tmpof3. Exon 7 2). The polypyrimidine tracts of the mouse exon is the ~/13-specific domain encoding amino acids 4, 6, 7, and 8 3'-splice sites are highly identical to 260-291, and exon 8 is the 8/~/~-specific domain the polypyrimidine tract of the human exon 4,

364 @ GENOME RESEARCH Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

MOLECULAR CHARACTERIZATION OF MOUSE THYMOPOIETIN GENE

1~ scription start site. Recogni- tion sequences for the tran- scription factor Spl (Ka- donaga et al. 1986) are P X B B B B E H B E E E E EB found at positions + 65, +

2 3 4 51 5' 27, -52, -83, -119 and I I - 198. Two GCCAAT boxes, , I potential binding sites for members of the CTF/NF-1 6B family of transcription fac- 3A tors (Santoro et al. 1988), are i I present within direct repeat 2B sequences in positions + 35 I I and + 68. Several potential - binding sites for the Even- Figure 3 Physical map of the mouse Tmpo gene. (A) Exons are numbered and skipped (Eve) homeo box indicated as boxes. Introns are indicated by a thin line. Sites for restriction protein are present at posi- enzymes BamHI (B), EcoRI (E), Hindlll (H), Pvull (P), and Xhol (X) are shown. The tions - 14, - 78, - 143, and various depictions in the exon bars (e.g., open, solid, hatched, etc.) correspond -221. to the encoded regions in Fig. 2. Exons 5' and 10' encode TP~ 3' UTR and TPI3' unique sequence, respectively. (B) Overlapping genomic clones used for map- ping and sequencing. Tmpol3 and LAP2 are Homologous Proteins 6a, 6b, and 6c 3'-splice sites, respectively. Inter- Data base comparison revealed that all human estingly, unlike the other, less conserved 3'-splice and mouse Tmpo cDNA sequences share a re- sites, these 3'-splice sites are participants in alter- markable identity and homology with the native splicing events. nuclear LAP2, isolated from rat. Figure 6 shows the comparison between the predicted amino acid sequence of the mouse Tmpo[3, the LAP2 Analysis of the 5'-flanking Region of the (GenBank accession no. U18314), and the hu- Tmpo Gene man TMPO[3 (accession no. U09087). The mouse The TSSG program analysis (Prestridge 1995) of Tmpo[3 and the rat LAP2 are 96% identical (Fig. the 5'-flanking sequence of the Tmpo gene revealed a predicted transcrip- tion start site located 378 bp up- 661-AGCTATTCTCAAGCTGGAATAACTGAGACTGAATGGACAAGTGGATCTTCAAAAGGCGGA stream of the translation initiation Exon 6a CCTCTC'GCAGGCATTAACTAGGGAATCTACAAGAGGGTCAAGAAGAACTCCAAGGAAAAGG-780/ codon (Fig. 5). The mouse and the human Tmpo 5'-flanking regions gtgatgcaaggcttattccttgggttttcagatttgtagggtttagtattatttatattt share conserved promoter sequences at tgtt tt tgtt tt gtt t caaa ct aa cag/78 I- GTGGAAACTTCAGAACATTTTCGTAT around the predicted transcription Exon 6b AGATGGT C CAGTAATTT CAGAGAGTACT CCCATAGCT GAAACTATAAT GGCTT CAAGCAA start site. Several potential binding .... sites for known transcription factors CGAATCCTTA-872/gtaaatatgtttcgtaaactatacaagtggtattctttgtaaatt were analyzed (Fig. 5). Like the hu- accctttaattggaaatcggggag... ~300 bp of in%zonio sequence ... man gene, no obvious TATAAA se- taagtgtctgtgttatgtttggataattctgagtctgaataatttgaatcttggcag/873- quence, a known binding site for the general transcription factor TFIID- GTTGT,'GTCAATAGGGTGACTGGAAATTTCAAGCATGCATCTCCTATTCTGCCAATCACTGAA Exon 6o TBP, could be found in the usual po- TTCT CAGACATAC CCAGAAGAGCACCAAAGAAACCATTGACAAGAGCT GAA- 987 sition -30 bp 5' to the transcription Figure 4 Nucleotide sequence of exon 6 region of the human TMPO start sites. This absence is character- gene. The exonic sequences are underlined and depicted as uppercase, istic of some other genes expressed in designated 6a, 6b, and 6c. Nucleotides of the intronic sequences are in many tissues. However, a closely re- lowercase. The numbers that flank the exonic sequences correspond to lated sequence (TTTAAA) is present the numbered base pairs of the human TMPOI3 gene (Harris et al. 29 bp upstream of the predicted tran- 1994).

GENOME RESEARCHO 365 Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

BERGER ET AL.

human has revealed numerous regions of homol- Table 1. Sequences of the exon/intron ogy between the two species, and this is clearly boundaries of the mouse Tmpo gene demonstrated between the central to distal por-

5' INTRON EXON 3' INTRON tion of mouse chromosome 10 and human chro- 1 -GGCAGG GTAAGGCGAACCCCCCCGGA mosome 12. The human homologs of Tmpo and TAATTGTACTTTGTTTGCAG AAAGCC- 2 -TTGTGG GTAATGGGTTTTATTTGTTT Mgf map to 12q22, and the human Igfl locus GAACTTCTCCCTTAACCC~ GAACAA- 3 -ATGAAG GTAACATTTTAACTGCTCTT maps to 12q22-q23. TGCCTCTTTTGCCTCTACAG GAAAGA- 4 =-eamain

TTCCTCGATGTTATTTCCAG ACTCTA- 5 -AATCAG GTATTTGTAGTTTTATGATC ATGTGTCGATGCTTGACTAG AGCTAT- 6 -AAAAGG GTGACGCAGGGCTTGCCGCT DISCUSSION TTCTGTTTTCAAACTAACAG GTGGAA- 7 -TCCTTA GTAAATATGTCTTATAATCT GATAATTTGAATCTTGGCAG GTGGCC- 8 -GCTGAA GTAAATGGATACCATTTAGC In this study the genomic structure and the vari- GTTTCTTTTTTTCTCACTAG GTGGGA- 9 -AATCAG GTACCTGGATGGATAAAACT ous alternatively spliced transcripts of mouse TTCTGTTTCACTTTG~CAG TGCTAG-10 Tmpo are reported. The mouse Tmpo gene, which (1) The consensus (AG) and GT) dinucleotides at the in- gives rise to at least seven alternatively spliced tron borders are in boldface type. mRNA transcripts, is encoded by a single gene (2) The intronic sequence downstream of exon 5 is the containing 10 exons. The novel mouse Tmpo amino-terminal sequence of exon 5'. mRNA, together with the similar genomic orga- nization between the mouse and the revised hu- man TMPO gene, and their 3'-splice site homol- ogy (Table 2), suggest a possible existence of 6), suggesting that Tmpo[3 is the mouse homolog more human alternatively spliced Tmpo tran- of the rat LAP2. scripts. This was confirmed by RT-PCR on hu- man thymic mRNA (not shown). Moreover, Har- ris et al. (1994) demonstrated the cross-reaction The TmpoGene is Located in the Central Region of their anti-human TMPO antibodies with the of Mouse Chromosome I0 75, 51, and 39 kD, of c~, [3, and 7 isoforms, respec- The mouse chromosomal location of Tmpo (lo- tively. However, these antibodies also cross- cus designation, Tmpo) was determined by in- reacted with another band, of -43-44 kD [Fig. 2 terspecific backcross analysis using progeny de- in Harris et al. (1994)], which we suggest to be the rived from matings of (C57BL/6J x Mus human TMPO~ gene. spretus)F 1 x C57BL/6J mice. This interspecific The absolute and relative abundances of backcross mapping panel has been typed for Tmpo ~, [3, and ~/mRNAs appear to vary in differ- >2000 loci that are well distributed amoung all ent tissues and cell lines (Harris et al. 1994; Berger the autosomes as well as the X chromosome et al. 1995), suggesting that both expression and (Copeland and Jenkins 1991). C57BL/6J and M. alternative splicing of Tmpo may be regulated in spretus DNAs were digested with several enzymes a tissue-specific manner. One possible mecha- and analyzed by Southern blot hybridization for nism for control of the formation of Tmpo c~ and informative restriction fragment length polymor- phisms (RFLPs) using a probe derived from the mouse cDNA. The mapping results indicated that Table 2. Comparison between the Tmpo is located in the central region of mouse mouse and the human 3'-splice sites chromosome 10 linked to -like growth fac- tor-1 (Igfl) and mast cell growth factor (Mg~ Fig. Mouse 3'-splice site Human 3'-splice site 7). Although 178 mice were analyzed for every TAATTGTACTTTGTTTGCAG /Exon 2 TTACTGGACTTTGTTTACAG /Exon 2 marker and are shown in the segregation analysis GAACTTCTCCCTTAACCCAG /Exon 3 CAAGTTCTGCCTTAATCCAG /Exon 3 TGCCTCTTTTGCCTCT~XTAG /Exon 4 TGCCTCTTTTGCCTCT~.C~ /Exon 4 (Fig. 7), up to 196 mice were typed for some pairs TTCCTCGATGTTATTTCCAG /Exon 5 TTCTCCAATGTTATTTCCAG /Exon 5 of markers. Each locus was analyzed in pair-wise ATGTGTCGATGCTTGACTAG /Exon 6 A~QTGTTGATGCTTG~ATAG /Exon 6a b combinations for recombination frequencies us- TTCTGTTTTCAAACTAAC~ /Exon 7 I GTTTTGTTTC.AAACTAACAG /Exon 6b ing the additional data. The ratios of the total AATAATTTGAATCTTGGCAG /Exon 8 AATAATTTGAATCTTGGCAG /Exon 6c number of mice exhibiting recombinant chro- GTTTCTTTTTTTCTCACTAG /Exon 9 GTTTGTCTGTTTCTTATTAG /Exon 7 mosomes to the total number of mice analyzed TTCTGTTTCACTTTGAACAG /Exon 10 I CCTCCTTTCACTCCCAACAG /Exon 8 for each pair of loci and the recombination fre- Identical sequences between the murine and the human quencies between the loci are shown in Figure 7. 3'-splice sites are in boldface type and underlined. Comparative gene mapping in mouse and

366 @ GENOME RESEARCH Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

MOLECULAR CHARACTERIZATION OF MOUSE THYMOPOIETIN GENE

-382 CCGAGGGAT T TGACTTCCTGCGCATCCCGCACCACAGCCTGTCTGGT T T T such as the Spl, Eve, CTF/NF1, and TTTAAA do- -332 CACGCTT CCCC TCCAGGGCCAGCAC TCGGAGGGGGCAAGCCAGT CGGCCG mains. The functional significance of these mo- -282 CCAT TCT TCAGGGGGGCT CGGTGT CCCGGGAAGGAGCCAGGCCCGGGCAA tifs has not yet been studied; however, the sig-

-232 AGGGTC C TCAAGAGAGC T GGCGCAGAACTCCGCCCGT CCGAAGGCGATCG nificant evolutionary conservation further sup-

-182 AGGCCGGGGACGAGGCT T C TCCGGTGCAGGGGGCCAGGAGAGAGGGAAAC ports the essential role of Tmpo in diverse cellular Spl functions. -132 AGCGT GCAAAACCCAGTACCGCCCATCCC T TATCAGCGTCCGAGGGGAAG Spl TATA- The Tmpo gene is located in the central re- - 82 GGCGGAGAGCGAGGAAACAGCGGGCCCCCCAGCTCCCGCCCAGCGCCTTT like EVE $ gion of mouse chromosome 10. We have com- - 32 TAAACTGCGTTTCTGCACCTCTCTGCCCTGCGCCGCGTGCGGCTGGAACC CTF/NFI +I -qpl pared our chromosome 10 interspecific map with + 19 CGCGCGAGCCGCCCAGGCCAATGGGTGGCGCCGCTTCCTCCGGGCGCCCG CTF/NFI a composite mouse linkage map that reports the + 69 C CAATGGCCGGCGCGCGTTCTTGGGGCGTGCGCGAGCCAGGCTCCCCCCA map location of many uncloned mouse muta-

+ 119 CCCCACCCCACGCACTAGCTTGTGTATGGGCTCGGCTCGCTGCGGGCTCC tions, provided from Mouse Genome Database

+169 GGGTTCGGCTGCACTTGCTCCCCAGTCGTGCGCCAGGGGCTTTTTGTGGC (The Jackson Laboratory, Bar Harbor, ME). Al- though several mutations lie in the region of +219 GGGGGCCGGACCTGGGTTTTGTGTCCCGAGTTCTGTGCCGTGCCCGCGGC Tmpo, none have a phenotype consistent with +269 CGC CGC TCCGGGGCAGGGATC TCCCGAGGCGGCGGGCGCAGCCCGGGACC what might be predicted for a mutation in Tmpo. +319 AGCGAGCGAGCGCGCCGGCG TGAGCAGCGGCGGCGGCGAC TG TGAGGGGC LAP2 is an integral membrane protein of the +369 CGGGGAGGAGAGAAGGAGGGGGACGAGATGCC GGA~TT CC TAGAGGAC M P E F L E D inner nuclear membrane, which binds directly to Figure 5 Sequence of the 5' end of the mouse both and in a mitotic Tmpo gene. ($) The transcription initiation site phosphorylation-regulated manner (Foisner and (nudeotide + 1). The conserved promoter se- Gerace 1993). The biochemical and physiological quences between mouse and human are in boldface properties of LAP2 suggest an important role in type. Sequences identical to binding site sequences nuclear envelope reassembly at the end of mitosis for known transcription factors are underlined, and and/or anchoring of the nuclear lamina and in- the names of the transcription factor are given above. The sequence has been deposited with Gen- Bank under accession no. U38185. Murme TP~ PEFLEDPSVLTKDKLKSELVANNVTLPA~DVYVQLYLQHLT 45 Rat LAP2 ...... Human TPI~ ......

Murine TPI~ ARNRPPLAAGANSKGPPDFSSDEERDATPVLGSG. ASVGRGRGAVG 90 mRNA is via control of its 3'-end cleavage and Rat LAP2 ...... E~ ...... polyadenylation site before splicing occurs (McK- Human TPI~ ...... P--T ...... EP ...... A-AA--S-A---

eown 1992). This would eliminate all of the Murtne TPI3 RKATKKTDKPRLEDKDDLDVTELSNEELLDOLVRYGVNPGPIVG-IT136 Rat LAP2 ...... P E downstream exons encoding Tmpo [3, e, 5, and 7 Human TP~ ...... Q ...... T--D ...... L ......

sequences and remove them as competitors for Murin¢ TP~ RKLYEKKLLKLREQGTESRSSTPLPTVSSSAENTRQNGSNDSDRYS 182 splicing of the exon 3 to the c~-specific exon 4. Rat LAP2 ...... A ...... Human TP~ ...... I ...... Harris et al. (1995) demonstrated a conserved se- Munne TP~ DNDEDSKIELKLEKREPLKGRAKTPVTLKQRRTEHNQSySOAGVTE228 quence downstream of the two alternative poly- Rat LAP2 ...... I ...... E ..... adenylation signals for human TMPOoL mRNA, Human TP~ --E ...... V ...... I-- suggesting that this sequence may be a binding Mmine TP~ TEWTSGSSTGOPLQALTRESTRGSRRTPRKRVETSQHFRIDGAVIS 274 Rat LAP2 ...... K ...... R---P ..... V ...... site for a factor that regulates TMPOoL mRNA 3'- Human TPJ~ ...... K ...... E ...... P--- end formation, perhaps in a tissue-specific man- Mufin¢ TP~ ESTPIAETIKASSNESLVANRLTGNFKHASSILPITEFSDITRRTp 320 ner (Keller 1992). The mouse Trnpo~ mRNA se- Rat LAP2 ...... D ...... Human TP~ ...... &--M ...... V--V ...... P ...... P--A- quence (Fig. 1C) shows two further polyadenyla- Murine TP~ KKPLTRAEVGEKTEERRVDRDILKEMFPYEASTPTGISASCRRPIK 366 tion sites, but the regulation mechanism of the Rat LAP2 ...... E ...... Tmpo~ mRNA 3'-end formation has not yet been Human TP~ ...... E ...... identified. Murine TPI~ GAAGRPLELSDFRMEESFSSKYVPKYAPLADVKSEKTKKRRSVPMW 412 Rat LAP2 ...... V ...... G ...... The 5' end of the Tmpo gene is GC-rich, a Human TP~ ...... V ...... G--I-V-

characteristic feature of many genes with a wide Murin¢ TP[~ IKMLLFALGACFLFLVYQAMETNQGNPFTNFLQ. DTKISN 451 Rat LAP2 ...... V-G ...... range of tissue expression. Comparison of mouse Human TI'~ --I---VVV-V ...... V---S---HV-PRK-- and human Tmpo promoter sequences reveals Figure 6 Comparison of the full-length predicted several common conserved sequences, around amino acid sequences of the mouse Tmpo[3, rat the predicted transcription start site, in a similar LAP2, and human TMPO[3. (Dashes represent iden- order. These conserved sequences contain poten- tical amino acids). Numbers represent the mouse tial binding sites for known transcription factors, TP[3 amino acids.

GENOME RESEARCH @ 367 Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

BERGER ET AL.

mD Dm Dm to the human TMPO[3, indicating that they are structurally and functionally homologous pro- r.,o mD mD Dm teins. Hence, Tmpo c~, e, 8, %and ~ are the LAP2 gene alternatively spliced forms. That assump- • mD mD mD tion is supported further by Western blot analysis 63 87 2 2 13 11 using LAP2-specific antibodies, which yielded bands at 53/43/41 kD (Konstantinov et al. 1995). The 53-kD polypeptide corresponds to the pre- dicted molecular mass of the Tmpo~/LAP2 pro- 10 tein. Based on the calculated molecular mass of II the various Tmpo isoforms, we suggest that the 43/41-kD bands are the alternatively spliced 8 and ~ forms, respectively. Like Tmpo[3/LAP2, the e, 8, and ~ Tmpo contain a single putative mem- -- lgfl 12q22-q23 brane-spanning sequence (Fig. 1B) (Furakawa et 41181, 2.24- 1.1 al. 1995), which suggests that these alternatively -- Tmpo 12q22 spliced isoforms are three additional integral membrane proteins of the inner nuclear mem- brane. By expressing deletion mutants of LAP2 in 26/185, 14.14- 2.6 cultured cells, it was found that the smallest nu- cleoplasmic fragment of LAP2 that can specify binding to components associated with the nuclear envelope are residues 244-398 (Furakawa -- Mgf 12q22 et al. 1995). However, mouse Tmpo e, 8, and ~ are identical to the Tmpo~/LAP2 except that they f, 3 lack residues 220-259, 220-291, and 221-328, re- spectively. Therefore, Tmpo[3/LAP2 and Tmpo e, Figure 7 Chromosomal location of Tmpo in the 8, and ~ may have different binding potencies to mouse genome. The locus was mapped by interspe- the various components associated with the cific backcross analysis. The segregation patterns of nuclear envelope. Tmpo and flanking genes are shown at the top. Each Two of the most favored cdc2 kinase sites in column represents the chromosome identified in LAP2 (Tmpo[3) are found at residues 256-259 and the backcross progeny that was inherited from the 320-323 (Furakawa et al. 1995). Phosphorylation (C57BL/6J × M. spretus) F1 parent. The solid boxes represent the presence of a C57BL/6J allele; open of these sites by cdc2 kinase may be involved in boxes represent the presence of M. spretus allele. modulating the interactions of LAP2 with chro- The number of offspring inheriting each type of matin and lamin during mitosis. However, Tmpo chromosome is listed beneath each column. A par- and 8 lack the cdc2 kinase site positioned at tial chromosome 10 linkage map showing the loca- residues 256-259, whereas Tmpo ~ is missing tion of Tmpo in relation to linked genes is shown at both sites. These findings propose a possible role the bottom. The number of recombinant N2 animals for the alternative splicing mechanism in nuclear over the total number of N2 animals typed plus the events and cell cycle regulation. recombination frequencies, expressed as genetic distance (in cM) (_ 1 s. e.) is shown for each pair of loci (left). The positions of loci in human chromo- METHODS somes are shown at right. References for the human Isolation of Mouse Trnpo cDNA Clones map positions of loci cited in this study can be ob- tained from Genome Data Base (GDB), a comput- A cDNA clone, designated Tmpo~ was isolated from a erized data base of human linkage information mouse (B6/CBAFJ females, 6-8 weeks old) thymic cDNA maintained by the William H. Welch Medical Library library (commercially purchased from Strategene). The ini- of The Johns Hopkins University (Baltimore, MD). tial probe used was a 126-bp fragment encoding Tmpo amino acids 1-42 from the bovine cDNA clone 113 (Zevin- Sonkin et al. 1992). Hybridization was performed with terphase chromosomes to the nuclear envelope. 35% formamide, 5 × SSPE (Sambrook et al. 1989) at 42°C, The amino acid sequence of the rat LAP2 is 96% with the highest stringency wash in 0.1 × SSPE at 50°C. In identical to the mouse Tmpo[3 and 91% identical the subsequent round of screening, a probe derived from

368 @ GENOME RESEARCH Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

MOLECULAR CHArACTErIZATION OF MOUSE THYMOPOIETIN GENE the initial mouse clone was used. The hybridization was in Interspecific Mouse Backcross Mapping 0.5 M NaHPO 4 (pH 7.2), 7% SDS, at 65°C, and the highest stringency wash was in 0.04 M NaHPO 4 (pH 7.2), 1% SDS, Interspecific backcross progeny were generated by mating at 65°C. All sequences reported were determined on both (C57BL/6J X M. spretus) F 1 females and C57BL/6J males as strands by the Sanger technique using Sequenase version described (Copeland and Jenkins 1991). A total of 205 N2 2.0 kit (Amersham). mice were used to map the Tmpo locus (see text for details). Southern blot analysis was performed as described (Jenkins et al. 1982). All blots were prepared with Hybond-N ÷ mem- Isolation and Analysis of Genomic Clones brane (Amersham). The Tmpo probe, a PCR-amplified frag- ment from the Tmpo mouse cDNA, was labeled with Four overlapping partial genomic clones were isolated [~32p]dCTP using a random priming labeling kit (Strata- from a library, prepared from a BALB/c liver DNA Sau3A- gene); washing was done to a final stringency of digested, and cloned into the EMBL 3A vector. The library 0.5 x SSCP, 0.1% SDS at 65°C. A fragment of 4.8 kb was was screened with the different cloned mouse Tmpo cD- detected in XbaI-digested C57BL/6J DNA and a fragment NAs. Clones were characterized initially by partial restric- of 4.0 kb was detected in XbaI-digested M. spretus DNA. tion mapping and by hybridization with defined regions The presence or absence of the 4.0-kb M. spretus-specific of Tmpo cDNAs. Fragments of interest were subcloned into fragment was followed in the backcross mice. pBluescript II SK(+) (Stratagene) for DNA sequencing. The A description of the probes and RFLPs for the loci intron-exon boundaries were determined by direct com- linked to Tmpo, including insulin-like growth factor-1 parison of the nucleotide sequences of Tmpo cDNAs clones (Igfl) and mast cell growth factor (Mg~) has been reported and genomic sequences, using sense and antisense primers previously (Copeland et al. 1990). Recombination dis- within the different cDNAs regions. Some of the sequenc- tances were calculated as described (Green 1981) using the ing primers were designed considering the alternatively computer program SPRETUS MADNESS. Gene order was spliced pattern. The sizes of introns were confirmed by determined by minimizing the number of recombination restriction endonuclease . events required to explain the allele distribution patterns.

Isolation of the Human Exon 6 and its ACKNOWLEDGMENTS Intronic Sequences We thank Dr. Eitan Friedman for his helpful and critical comments regarding this work, and Biana Freits for excel- PCR was performed using a DNA thermal cycler. The fol- lent technical assistance. We thank D. Swing and D.J. Gil- lowing human TMPO~ oligonucleotide primers were syn- bert for excellent assistance. This research was supported, thesized: 5'-AGCTATTCTCAAGCTGGAA-3', sense nucleo- in part, by the National Cancer Institute, Department of tides 661-680 and 5'-TTCAGCTCTTGTCAATGG-3', anti- Health and Human Services, under contract with ABL sense nucleotides 987-970 (Harris et al. 1994). A 50-~1 (N.G.C., N.A.J., and K.B.A.) and by a grant from the Na- reaction mixture containing 500-1000 ng of placental ge- tional Institutes of Health, National Research Service nomic human DNA in 50 mM KC1, 10 mM Tris-HC1 (pH Award 5F32GM15909-02 from the National Institute of 8.3), 1.5 mM MgC12, 0.001% gelatin, 250 mM (each) dNTPs: General Medical Sciences (K.B.A.). The work reported is dATP, dCTP, dGTP, dTTP, 20 pmoles of each of the prim- part of the Ph.D. thesis to be submitted by R.B. to the ers, and I unit of Taq DNA polymerase was subjected to 35 Sackler Faculty of Medicine, Tel Aviv University. The se- cycles of amplification. PCR conditions was follows: I min quence data described in this paper have been submitted at 94°C, 1 min at 50°C, and 1.5 min at 72°C with a final to the GenBank data library under accession nos. elongation step at 72°C for 10 min. The human sequences U39073-U39078 and U38185. were obtained using the ALFexpress automatic DNA se- The publication costs of this article were defrayed in quencer (Pharmacia Biotech). part by payment of page charges. This article must there- fore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact. Sequence Analysis Computer analysis of DNA and protein sequences were REFERENCES done using the Genetics Computer Groups (GCG) soft- ware package (Genetics computer group 1991). Sequences Altschul, S.F., W. Gish, W. Miller, E.W. Myers, and D.J. were analyzed for homology to nucleic acid and protein Lipman. 1990. Basic local alignment search tool. J. Mol. data bases using the Blast program (Altschul et al. 1990) of Biol. 215: 403-410. the National Center for Biotechnology Information via the Internet. Audhya, T. and G. Goldstein. 1988. Amino acid Analysis of the 5'-flanking region of the gene for can- sequence of thymopoietin isolated from skin. Ann. N.Y. didate sequences similar to known binding sites for tran- Acad. Sci. 548: 233-240. scriptional regulatory proteins, and for the recognition of Audhya, T., D.H. Schlesinger, and G. Goldstein. 1981. the start of transcription site, was done with the TSSG Complete amino acids sequences of bovine program (Prestridge 1995) using the TFD transcription fac- Thymopoietins I, II, and III: Closely homologous tor data base (Ghosh 1991). polypeptides. Biochemistry 20" 6195-6200. The sequences described here have been deposited in the GenBank data base (accession nos. U38185 and Berger, R., L. Theodor, F. Brok-Simoni, H. Ben-Bassat, L. U39073-U39078). Trakhtenbrot , J. Shoham, and G. Rechavi. 1995.

GENOME RESEARCH @ 369 Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

BERGER ET AL.

Demonstration of Thymopoietin transcripts in different Kadonaga, J.T., K.A. Jones, and R. Tjian. 1986. hematopoietic cell lines. Acta Haematol. 93: 62-66. Promoter-specific activation of RNA polymerase II transcription by Spl. Trends Biochem. Sci. 11- 20-23. Conant, M.A., L.H. Calabrese, S.E. Thompson, BJ. Poiesz, S. Rasheed, R.L. Hirsch, L.A. Meyerson, A.B. Kalderon, D., B.L. Roberts, W.D. Richardson, and E.A. Kremer, C.C. Wang, and G. Goldstein. 1992. Smith. 1984. A short amino acid sequence able to specify Maintenance of CD4 ÷ cells by thymopentin in nuclear location. Cell 39" 49%509. asymptomatic HIV-infected subjects: Results of a double-blind, placebo-controlled study. AIDS 6: 1335-1339. Kantharia, B.K., N.J. Goulding, N.D. Hall, J. Davies, P.J. Maddison, P.A. Bacon, M. Farr, J.A. Wojtulewski, K.M. Englehart, S.P. Liyanage, and N.L. Cox. 1989. Copeland, N.G. and N.A. Jenkins. 1991. Development Thymopentin (Tmpo-5) in the treatment of rheumatoid and applications of a molecular genetic linkage map of arthritis. Br. J. Rheumatol. 28- 118-123. the mouse genome. Trends Genet. 7: 113-118.

Copeland, N.G., D.J. Gilbert, B.C. Cho, PJ. Donovan, Keller, W. 1992. The biochemistry of 3'-end cleavage N.A. Jenkins, D. Cosman, D. Anderson, S.D. Lyman, and and polyadenylation of messenger RNA precursors. Annu. D.E. Williams. 1990. Mast cell growth factor maps near Rev. Biochem. 61: 419-440. the Steel locus on mouse chromosome 10 and is structurally altered in a number of Steel alleles. Cell Konstantinov, K., R. Foisner, D. Byrd, F.-T Liu, W.-M 63: 175-183. Tsai, A. Wiik, and L. Gerace. 1995. Integral membrane proteins associated with the nuclear lamina are novel Foisner, R. and L. Gerace. 1993. Integral membrane autoimmune antigens of the nuclear envelope. Clin. proteins of the nuclear envelope interact with lamins Imunnol. Immunopathol. 74" 89-99. and chromosomes, and binding is modulated by mitotic phosphorylation. Cell 73: 1267-1279. McKeown, M. 1992. Alternative mRNA splicing. Annu. Rev. Cell Biol. 8: 133-155. Furukawa, K., N. Pante, U. Aebi, and L. Gerace. 1995. Cloning of a cDNA for lamina-associated polypeptide 2 Nigg, E.A. 1993. Cellular substrates of p34 cac2 and its (LAP2) and identification of regions that specify companion cyclin dependent kinases. Trends Cell Biol. targeting to the nuclear envelope. EMBO J. 3: 296-301. 14: 1626-1636. Patschinsky, T., T. Hunter, F.S. Esch, J.A. Coo, and B.M. Genetics Computer Group. 1991. Program manual for Seft. 1982. Analysis of the sequence of amino acids the GCG package, version 7, April, 1991, Madison, WI. surrounding sites of tyrosine phosphorylation. Proc. Natl. Acad. Sci. 79: 973-977. Ghosh, D. 1991. New developments of a transcription factors data-base. Trends Biochem. Sci. 16: 445-447. Prestridge, D.S. 1995. Predicting Pol II promoter sequences using transcription factor binding sites. J. Mol. Goldstein, G. 1974. Isolation of bovine thymin: A Biol. 249" 923-932. polypeptide of the thymus. Nature 247: 11-14.

Sambrook, J., E.F. Fritsch, and T. Maniatis. 1989. Green, E.L. 1981. Linkage, recombination and mapping. Molecular cloning: A laboratory manual, Cold Spring In Genetics and probability in animal breeding experiments Harbor Laboratory Press, Cold Spring Harbor, NY. pp. 77-113. Oxford University Press, New York, NY.

Harris, C.A., P.J. Andryuk, S. Cline, H.K. Chan, A. Santoro, C., N. Mermod, P.C. Andrews, and R. Tjian. Natarajan, J.J. Siekierka, and G. Goldstein. 1994. Three 1988. A family of human CCAAT-box-binding proteins distinct human thymopoietins are derived from active in transcription and DNA replication: Cloning and alternatively spliced mRNAs. Proc. Natl. Acad. Sci. expression of multiple cDNAs. Nature 334: 218-224. 91: 6283-6287. Schlesinger, D.H. and G. Goldstein. 1975. The amino Harris, C.A., P.J. Andryuk, S. Cline, S. Mathew, J.J. acid sequence of thymopoietin II. Cell 5: 361-365. Siekierka, and G. Goldstein. 1995. Structure and mapping of the human Thymopoietin (Tmpo) gene and Zevin-Sonkin, D., E. Ilan, D. Guthmann, J. Riss, L. relationship of human Tmpo [3 to rat lamin-associated Theodor, and J. Shoham. 1992. Molecular cloning of the polypeptide 2. Genomics 28: 198-205. bovine thymopoietin gene and its expression in different calf tissues: Evidence for a predominant expression in Jenkins, N.A., N.G. Copeland, B.A. Taylor, and B.K. Lee. thymocytes. Immunol. Lett. 31: 301-310. 1982. Organization, distribution and stability of endogenous ecotropic murine leukemia virus DNA sequences in chromosomes of Mus. musculus. J. Virol. Received January 12, 1996; accepted in revised form March 43: 26-36. 29, I996.

370 @ GENOME RESEARCH Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

The characterization and localization of the mouse thymopoietin/lamina-associated polypeptide 2 gene and its alternatively spliced products.

R Berger, L Theodor, J Shoham, et al.

Genome Res. 1996 6: 361-370 Access the most recent version at doi:10.1101/gr.6.5.361

References This article cites 26 articles, 3 of which can be accessed free at: http://genome.cshlp.org/content/6/5/361.full.html#ref-list-1

License

Email Alerting Receive free email alerts when new articles cite this article - sign up in the box at the Service top right corner of the article or click here.

To subscribe to Genome Research go to: https://genome.cshlp.org/subscriptions

Copyright © Cold Spring Harbor Laboratory Press