Proc. Natl. Acad. Sci. USA
Vol. 85, pp. 1615-1619, March 1988
Immunology

Exon-intron organization and sequence comparison of human and murine T11 (CD2) genes

(genomic maps/molecular cloning/T-cell receptors/DNA)

DON J. DIAMOND*†, LINDA K. CLAYTON*†, PETER H. SAYRE*, AND ELLIS L. REINHERZ*‡

*Laboratory of Immunobiology, Dana-Farber Cancer Institute and Departments of †Pathology and ‡Medicine, Harvard Medical School, 44 Binney Street, Boston, MA 02115

Communicated by Barry R. Bloom, November 2, 1987 (received for review September 23, 1987)

ABSTRACT Genomic DNA clones containing the human and murine genes coding for the 50-kDa T11 (CD2) T-cell surface glycoprotein were characterized. The human T11 gene is ≈12 kilobases long and comprised of five exons. A leader exon (L) contains the 5'-untranslated region and most of the nucleotides defining the signal peptide [amino acids (aa) -24 to -5]. Two exons encode the extracellular segment; exon Ex1 is 321 base pairs (bp) long and codes for four residues of the leader peptide and aa 1-103 of the mature protein, and exon Ex2 is 231 bp long and encodes aa 104-180. Exon TM is 123 bp long and codes for the single transmembrane region of the molecule (aa 181-221). Exon C is a large 765-bp exon encoding virtually the entire cytoplasmic domain (aa 222-327) and the 3'-untranslated region. The murine T11 gene has a similar organization with exon-intron boundaries essentially identical to the human gene. Substantial conservation of nucleotide sequences between species in both 5'- and 3'-gene flanking regions equivalent to that among homologous exons suggests that murine and human genes may be regulated in a similar fashion. The probable relationship of the individual T11 exons to functional and structural protein domains is discussed.

The 50-kDa T11 (CD2) surface glycoprotein plays a major role in T-lymphocyte function (1-5). Monoclonal antibodies directed against an epitope on the external segment of the human T11 molecule (T11₁) block T-lymphocyte activation, including that mediated through the T-cell receptor for antigen and major histocompatibility complex (Ti-T3) (6). In contrast, antibodies defining two other extracellular segment epitopes (T11₂ and T11₃) in concert give rise to antigen-independent human T-cell activation (1-3). Thus, interaction of specific monoclonal antibodies with the T11 molecule produces either profound antagonistic or agonistic effects on T lymphocytes in vitro. The ability of anti-T11₁ monoclonal antibodies to inhibit T-lymphocyte activation is predicated, at least in part, on their capacity to abrogate T11-mediated cell-cell contact (7). In fact, the spontaneous interaction between human T lymphocytes and sheep erythrocytes is but one manifestation of the role of T11 in facilitating cell-cell adhesion (8).

To more precisely define the structure of this functionally important molecule, we and others have elucidated the complete primary structure of the human T11 molecule by protein microsequencing and cDNA cloning (9) or by selection of expressed cDNA clones (10, 11). The following three segments of the protein have been defined: (i) an extracellular segment comprising more than half of the molecule and bearing only limited homology to members of the immunoglobulin gene superfamily including the T4 T-cell surface protein (9, 10); (ii) a single hydrophobic transmembrane segment; and (iii) a lengthy 117-amino acid cytoplasmic domain rich in prolines and basic residues. cDNAs encoding the murine and rat equivalents of T11 have been cloned and shown to define a homologous structure bearing a 50% overall identity at the amino acid level (12-14). In contrast to immunoglobulins whose variable and constant region domains consist of antiparallel β-sheets, protein modeling of murine and human T11 external segments indicates that these structures most likely belong to an α/β-protein folding class (12). Furthermore, the cytoplasmic region of T11 in each species is predicted to have a nonglobular conformation and is sufficiently elongated to allow for potential interactions with multiple other intracellular proteins. It is thus likely that the cytoplasmic region subserves a signal transduction function.

Here we report on the isolation and characterization of human and murine T11 genes. DNA sequence analysis demonstrates that the human and murine exon-intron organization is virtually identical.§ Of note, the extracellular segment of the mature T11 protein is encoded by two exons, whereas almost the entire intracellular segment and 3'-untranslated region is encoded on a single exon.

Abbreviation: aa, amino acid(s).

§These sequences reported in this paper are being deposited in the EMBL/GenBank data base (Bolt, Beranek, and Newman Laboratories, Cambridge, MA, and Eur. Mol. Biol. Lab., Heidelberg) (accession no. J03622 for the human T11 gene and J03623 for the murine T11 gene).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Genomic Library Screening and Restriction Mapping. A human peripheral blood lymphocyte genomic library (kindly provided by S. Orkin, Children's Hospital, Boston, MA) containing DNA partially digested with Sau3aI and ligated into the EMBL3A vector (15) was screened with a 335-base-pair (bp) 5' fragment of the human T11 cDNA (PB2), ending at the unique EcoRV site (positions 8-335), and with a 401-bp 3' fragment ending at the second internal Taq I site (positions 617-1017) (9). Genomic fragments containing regions that hybridized to PB2 cDNA were cloned into pUC18 for further restriction analysis. Exons were identified within the human T11 gene through hybridization with kinase-treated 17-base oligonucleotides whose sequences were derived from the human cDNA. All mapping distances are accurate to within 100 bp. Oligonucleotide synthesis employed standard cyanoethyl phosphoramidite chemistry on an Applied Biosystems model 381A (Applied Biosystems, Foster City, CA).

BALB/c mouse liver DNA was partially digested with Mbo I, and the size-selected 15- to 20-kilobase (kb) fragments were ligated into the EMBL3B vector and propagated in the host bacterium LE392 to produce a murine genomic library. The 819-bp 5' EcoRI fragment of the murine cDNA XB2 (12) was used to screen the library for murine T11 sequences. Restriction analysis of the phage DNA and subcloning of hybridizing fragments was performed as above.

Sequence Analysis. Fragments obtained by electroelution from agarose gels of bacteriophage λ DNA digests were subcloned into bacteriophage M13mp18 or -mp19 sequencing vectors by standard procedures (16). Sequence analysis was performed on these clones by the dideoxy chain-termination procedure of Sanger et al. (17) with either [α-32P]dATP or adenosine 5'-[α-35S]thiotriphosphate (dATP[35S]) as a single radioactive nucleotide. In some cases, sequence information was obtained from fragments cloned into pUC18 by utilizing avian myeloblastosis virus reverse transcriptase (New England Nuclear) on double-stranded DNA as described by Chen and Seeburg (18). All 5'- and 3'-flanking sequences and coding regions were sequenced on both strands.

RESULTS AND DISCUSSION

Isolation of Recombinant Phage Containing the Human and Mouse T11 (CD2) Gene Segments. A genomic library constructed in the λ vector EMBL3A (15) by partial Sau3aI restriction endonuclease digestion of human peripheral blood lymphocyte DNA was screened with 5'- and 3'-specific fragments from the human T11 cDNA. Three types of phage were recovered from the library that encompassed all of the coding segments from the known cDNA sequence (9) as well as extensive 5'- and 3'-flanking regions (Fig. 1). The 5' fragment detected two classes of phage, such as λBT2D2 and λBT2, whereas the 3' fragment detected λBT2 and λBT1 (see Fig. 1). The genomic copy of the murine homologue of T11 was obtained in a similar fashion by probing a murine genomic partial library constructed in the λ vector EMBL3B with cDNA fragments (12). Once again, three overlapping phage inserts were obtained that corresponded to the complete coding sequence of the cDNA as well as extensive 5'- and 3'-flanking regions (Fig. 1). The 5' portion of the gene was contained within λYH2, whereas λYH16 and λ8B contained the 3' segments (Fig. 1).

Structural Comparisons of the Human and Murine Homologue of the T11 Gene. Southern blotting experiments with oligonucleotide probes on the isolated recombinant phage from the human and mouse genomic libraries revealed that the coding segments of T11 overlapped the derived phage clones such that the 5' and 3' ends of the corresponding cDNA were located on separate phage inserts (Fig. 1). Results of restriction analysis and DNA sequencing of human and murine T11 genes indicate that, in each case, five exons are encoded within a gene ≈12 kb long (Fig. 1). The exons are separated by introns that vary from 0.1 to 3.9 kb in length. Both the sizes and positions of the exons and introns are extremely similar between the human and murine T11 genes (see Table 1). This conservation of the length of the individual exons and introns and their relative positions suggests that both genes evolved similarly to encode proteins with homologous function. Comparison of the murine and human genomic copies of interleukin 2 (19), the δ subunit of T3 (20), the α subunit of the T-cell receptor Ti (21, 22), and the β subunit of the T-cell receptor Ti (23, 24) also revealed similar exon-intron organizations. These genes presumably have similar functions in both species. The T-cell lineage restricted distribution of the human and murine T11 mRNA, shown in RNA gel blot analysis (9, 10), provides further evidence for the similarity of function of these two genes. Chromosomal locations for the murine and human T11 genes (ref. 13 and L.K.C. and E.L.R., unpublished results) have been shown to be on chromosomes 3 and 1, respectively. These chromosomes are known to be syntenic, which further implies a common origin and evolution for the murine and human T11 genes.

Nucleotide Sequence Analysis. By using a series of 17-base oligonucleotide primers generated from the sequence of the murine and human T11 cDNAs, the exon-intron structure of the corresponding T11 genes was determined (Fig. 2). The nucleic acid sequence of each exon bounded by its canonical 5'- and 3'-splice junctions (26) is shown in Fig. 2. There is a high degree of similarity between the boundaries of the respective human and murine exons, such that in every case a precisely homologous codon is split between the first and second nucleotide in the corresponding exon boundary (see Fig. 2). For the T-cell receptor subunit genes (21-24), immunoglobulin genes (27), and T3 δ-subunit genes (20), a codon triplet is also split between the first and second nucleotides by an intron junction. However, introns of genes such as γ interferon (28), human (29), and mouse interleukin 2 (19) split exons between codons. Table 1 shows that a codon yielding an identical amino acid is present (e.g.,

[Fig. 1 image: restriction maps of the human T11 gene (Upper) and the murine T11 gene (Lower), with exons L, Ex1, Ex2, TM, and C marked above each map, a 1-kb scale bar, and the recombinant phage λBT2, λBT2D2, and λBT1 (human) and λYH2, λYH16, and λ8B (murine) drawn beneath.]

FIG. 1. Partial restriction maps of the human (Upper) and murine (Lower) T11 (CD2) genes. Cleavage sites of restriction enzymes are as follows: A, Ava II; B, BamHI; Bg, Bgl II; E, EcoRI; H, HindIII; Hp, Hpa I; N, Nde I; P, Pst I; Pv, Pvu II; R, EcoRV; S, Sac I; Sm, Sma I; X, Xba I; and Xh, Xho I. The dark boxes above the restriction maps represent exons L, Ex1, Ex2, TM, and C. The exons correspond to the following positions in the human PB1 cDNA sequence (9): bp 1-84 (exon L), bp 85-405 (exon Ex1), bp 406-636 (exon Ex2), bp 637-759 (exon TM), and bp 760-1524 (exon C). The original EMBL3 recombinant phage used to subclone the respective genes are shown as open boxes under the restriction map.
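The coordinates in the Fig. 1 legend can be checked arithmetically. The Python sketch below is an illustration added for that purpose and is not part of the original analysis; the names EXONS_CDNA, exon_lengths, is_contiguous, and whole_codon_exons are hypothetical. It derives each exon's length from its PB1 cDNA positions and confirms that the five exons tile the cDNA without gaps. Exon L measures 84 bp in these cDNA coordinates, whereas Table 1 lists 157 bp for the human exon, a value based on the putative cap site.

```python
# Exon positions in the human PB1 cDNA (1-based, inclusive), from the Fig. 1 legend.
EXONS_CDNA = [
    ("L", 1, 84),
    ("Ex1", 85, 405),
    ("Ex2", 406, 636),
    ("TM", 637, 759),
    ("C", 760, 1524),
]


def exon_lengths(exons):
    """Return {exon name: length in bp} for inclusive 1-based coordinates."""
    return {name: end - start + 1 for name, start, end in exons}


def is_contiguous(exons):
    """True if each exon starts one base after the previous exon ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(exons, exons[1:]))


def whole_codon_exons(exons, names=("Ex1", "Ex2", "TM")):
    """Return the listed internal exons whose length is a multiple of 3."""
    lengths = exon_lengths(exons)
    return [name for name in names if lengths[name] % 3 == 0]


if __name__ == "__main__":
    for name, length in exon_lengths(EXONS_CDNA).items():
        print(f"exon {name}: {length} bp")
    print("contiguous:", is_contiguous(EXONS_CDNA))
    print("multiple-of-3 internal exons:", whole_codon_exons(EXONS_CDNA))
```

Exons Ex1, Ex2, and TM come out at 321, 231, and 123 bp, matching the sizes given in the abstract. Each is a multiple of 3, so an intron that interrupts a codon after its first nucleotide at the L/Ex1 junction leaves every later junction interrupting a codon at the same position.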

[Fig. 2 sequence panel: aligned nucleotide and deduced amino acid sequences of the human (h) and murine (m) T11 exons L, Ex1, Ex2, TM, and C, with flanking splice sites and codon numbering from aa -24 to aa 327. The sequences are deposited under GenBank accession nos. J03622 (human) and J03623 (murine).]

FIG. 2. Alignment of the individual exons from the human (h) and murine (m) T11 gene sequences. The canonical 5'- and 3'-splice sites are also shown at the ends of the respective exons. The leader exon (exon L) begins with the putative mRNA transcription initiation site in the human sequence. Numbering refers to codon positions with respect to the initial amino acid of the mature protein (aa +1). The canonical polyadenylylation signal sequence AATAAA is underlined. The exon sequences from mouse and human were aligned with the LOCAL algorithm (25) with gaps introduced (dashed lines) as a result of the alignment. Nucleotide identities are indicated by asterisks. Exons L, Ex1, Ex2, TM, and C are described in the legend to Fig. 1.
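The alignments in this paper were produced with the LOCAL algorithm (25). As a rough illustration of how a local alignment score of this kind can be computed, the Python sketch below implements a Smith-Waterman-style recursion with length-dependent gap penalties. It is a minimal sketch under stated assumptions, not the LOCAL program itself: the mismatch weight (-0.10) and gap weight, -(1.01 + 0.90 × length), are taken from the legend to Fig. 3, while the match score of +1 and the function name local_alignment_score are assumptions made here for illustration.

```python
def local_alignment_score(a, b, match=1.0, mismatch=-0.10,
                          gap_open=1.01, gap_extend=0.90):
    """Best local alignment score of sequences a and b.

    A gap of length k costs gap_open + gap_extend * k (weights quoted in
    the Fig. 3 legend). The match score of +1 is an assumption.
    Returns (best_score, (end_in_a, end_in_b)) using 1-based end positions;
    (0.0, (0, 0)) means no positively scoring alignment was found.
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j), floored at 0.
    # E: best score ending with a gap in a; F: ending with a gap in b.
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best, best_end = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap (first residue costs open + extend) or extend one.
            E[i][j] = max(H[i][j - 1] - (gap_open + gap_extend),
                          E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - (gap_open + gap_extend),
                          F[i - 1][j] - gap_extend)
            step = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0.0, H[i - 1][j - 1] + step, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_end = H[i][j], (i, j)
    return best, best_end


if __name__ == "__main__":
    # Toy sequences, not taken from the T11 genes.
    print(local_alignment_score("ACGTACGTAC", "ACGTCGTAC"))
```

With these weights a mismatch is cheap and a gap is expensive, so the best local alignment bridges short mismatched stretches and opens a gap only when enough matching sequence follows to pay for it. The sketch reports a score and end point only; recovering the aligned strings would need a traceback step that is omitted here.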

glycine between exons L and Ex1) at each of the junctions in comparison of the human and murine genes. The nucleotide homology between species ranges from 62.1% to 71.5% in comparison of individual exons of the T11 genes (Table 1) (overall homology 67.8%).

Table 1. Comparison of human and murine T11 exons

Exon  Source  Exon size, bp  Intron length, kb  aa interrupted  % homology
L     H       157*           0.77               Gly
      M       149*           0.108              Gly             70.5
Ex1   H       321            3.3                Glu
      M       309            3.3                Glu             62.1
Ex2   H       231            2.9                Glu
      M       231            2.9                Glu             68.3
TM    H       123            3.9                Asp
      M       123            3.9                Asp             71.5
C     H       765
      M       374                                               70.8

H, human; M, mouse.
*Exon size is based on putative cap site.

Exons often encode functional domains within a given protein. The exon-intron organization of T11 is representative of a type I integral membrane protein (30) with exon L encoding the 5'-untranslated sequence and the hydrophobic leader peptide and with exon TM encoding the single hydrophobic membrane-spanning segment. In addition, exons Ex1 and Ex2 encode the hydrophilic extracellular T11 segment, and exon C encodes virtually the entire cytoplasmic segment and 3'-untranslated sequence. Of note, 11 putative cytoplasmic amino acid residues are also encoded within exon TM (Fig. 2). However, these are predominantly charged amino acid residues that are presumably needed to anchor T11 in the plasma membrane. Nucleotide homology searches employing each of the human T11 exons separately failed to show any significant identities with known genes other than T11 in GenBank.¶ This finding indicates that each of the T11 exons is unusual in nature and distinct from immunoglobulin genes, T-cell receptor subunits, or other members of the immunoglobulin gene superfamily.

¶National Institutes of Health (1987) Genetic Sequence Databank: GenBank (Research Systems Div., Bolt, Beranek, and Newman, Cambridge, MA), Tape Release 50.

It is worthy of comment that the T11 gene organization differs substantially from that of the δ subunit of T3 (20) whose gene product is also selectively expressed in T-lineage cells and is thought to be involved in signal transduction. Unlike T11, the T3 δ-subunit gene is comprised of a single exon that encodes the extracellular segment of the mature protein and several exons encoding cytoplasmic segments and 3'-untranslated sequence. Moreover, unlike T-cell receptor subunit variable or constant region exons, exon Ex1 of T11 lacks any codons for cysteine residues that might be involved in potential intrachain disulfide bond formation known to stabilize immunoglobulin domains (21-24).

The extracellular T11 protein segment is involved in facilitating cell-cell contact as well as T-cell activation events (1-7). Although it is likely that these functions may be mediated by the same structural domain, the possibility remains that the functions are compartmentalized into exons Ex1 and Ex2. In this regard, we have already observed that the T11₁ epitope, which presumably mediates conjugate formation (i.e., sheep erythrocyte-T-cell-rosette formation and cytotoxic T lymphocyte-target conjugate formation) by interacting with the ubiquitous surface structure lymphocyte function-associated antigen 3 (4, 7), resides on a protein fragment encoded exclusively by exon Ex1 (N. E. Richardson and E.L.R., unpublished observation). Whether exon Ex2 encodes the domain responsible for initiating activation events, however, remains to be determined.

The human genomic T11 sequence agrees with our published cDNA sequence (9) with the following four differences: (i) a cytidine in the genomic sequence instead of a thymidine reported at position 338 of the cDNA; (ii) the addition of a cytidine after position 1037; (iii) TG instead of GT at positions 1322 and 1323; and (iv) the addition of an adenosine after position 1336 (numbered according to PB1 cDNA in ref. 9). These differences are the result of sequencing errors in the published cDNA sequence (9). Only the additional cytidine after position 1037 affects the predicted T11 protein sequence. It causes a frameshift leading to a predicted protein 13 amino acids shorter than reported and to a carboxyl terminus with better homology to the predicted murine protein (12). This change also agrees with the sequence reported by Seed and Aruffo (11).

The murine T11 genomic sequence reported here is derived from the BALB/cJ mouse strain. It differs at the following five nucleotide positions from the B10.D2 T11 cDNA sequence reported (12): (i) a guanosine in the genomic sequence instead of an adenosine at position 397 of the B10.D2 cDNA; (ii) an adenosine instead of a guanosine at position 450; (iii) a guanosine instead of an adenosine at nucleotide position 539; (iv) a guanosine instead of a thymidine at position 588; and (v) a cytidine instead of a thymidine at position 590 (numbered according to XB2 cDNA in ref. 12). These alterations result in four predicted amino acid differences: at amino acid (aa) residue 106, valine instead of methionine (conservative change); at aa residue 153, serine instead of asparagine (conservative change); at aa residue 169, lysine instead of asparagine (nonconservative change); and at aa residue 170, threonine instead of methionine (semiconservative change). These changes probably represent genetic polymorphism between the BALB/cJ and B10.D2 mouse strains. Each of these is restricted to exon Ex2, suggesting that there may be fewer evolutionary constraints in this region of the gene.

Analysis of T11 Gene Flanking Sequences. To compare putative regulatory regions of the T11 gene in man and mouse, both 5'- and 3'-flanking regions were analyzed. We obtained ≈450 bp of upstream flanking DNA sequence from each of the T11 genes (Fig. 3). We provisionally determined the human cap site with a modification (31) of the Weaver and Weissman method (32) of S1 nuclease analysis (data not shown). Alignment of human and mouse 5' sequences with the LOCAL algorithm (25) showed very significant homology (61.3%). Of note are several regions of near identity between the human and mouse genes (from positions -249 to -229 including 18 of 21 residues, from positions -211 to -192 including 16 of 20 residues, and from positions -1 to +25 including 21 of 26 residues) that may have some common T-lineage importance for gene regulation.

We were unable to locate a canonical Goldberg-Hogness box in the 5'-flanking sequence from either the murine or human T11 genes that was consistent with the known sizes of T11 mRNA detected by RNA gel blot analysis (9-11). However, several upstream candidate sequences within the human gene (positions -322, -292, and -280) and mouse gene (position -360, data not shown) are present within the 5'-flanking DNA. In this regard, several other genes have been shown not to contain the canonical "TATA" sequence, and in some of these cases [δ subunit of T3 (20); Thy-1 (33); hypoxanthine phosphoribosyltransferase (34)] multiple mRNA initiation sites have been described.

   -342  -300
h  CTATTGGCTTGTGAACATTTACCTATATTTCTATGTGGTCTTGTTAGCGACAGTATACCTAAGTGCATAA
m  CTATT- -CCTATGACCTTTTGCCGGTCAGTCTACTTGGTCCTGTTGAACCCAGCACAGCTCAGTGGCCA
   -250
h  AAGGCTGTCTGGTTGAATTTGGCTTCTTGTTTACAAAAGAGTGATCCTTAGTGATCTACTTAGCCTCTCT
m  --- GCTATGTT----M-- -- GCC-CCTTGTTTACAAAAGGGT-AGC- -TA-T -CGCAACTGA-CCTCCCA
   -200  -150
h  GTTCCTTTTC-TCTTTCACTGAGATCAGAAAACCTATCCTTCCCAATTTTTTTGTGTGAGAATTAAAATG
m  GTTCCTCTTCTTCCTCTGGTGGGGTGCTAAAACCCA-ACACCCTAAGCTCTTTTTCTAAACATTAAAATG
   -100
h  CAGCA-AGAAAACACACACTCATAAACACATCTGCTTTGGCAAAGGACCACATCAGAAGG - -GCTGGCTT
m  TGACACAGACACCTATTTCGGTTAAGGAGGGCAGCAAATGCATGAGTCGTTTTGATAGGGTCTCTGTCTC
   -50
h  GTCCGCGCT-CTTGC --- -TCTCTGTGTATGTGTATTATGTT ---- -TTATGTTACTGT-AAAAGATGTAAA
m  TCTCTCTCTCCTTCCCCATCTCTACCTCTCCCTCTCCCCCTCCCCCTCCCCTACTGTGAACAGCAGGCAT
   +1  +50
h  CAGAGGCACGTGGTTAAGCTCTCGGGGTGTCGACTCCACCACTCTCACTTCAGTTCCTTTTGCATCAAGA
m  GAAAGACACGTGGTTCAGGTTGCTGGGTGTGGCTTCTGCCAGCCTCACCACAG TCC - -TGACAGAAACA
   +99
h  GCTCAGAATCA-AAAGAGGAAACCAACCCCTAAGATC
m  ACTCAGAGTCACCCCTGGGAAAAGAACTCTAAAGATG

FIG. 3. Comparison of the human (h) and murine (m) T11 gene 5'-flanking regions. Numbering refers to the putative transcription initiation site (position +1) in human. Alignments were performed as in Fig. 2. The homology value was highly significant at 195; weights used were as follows: for gaps, -(1.01 + 0.90 × length), and for mismatches, -0.10 (ref. 25). The initial methionine codon has been double underlined.

Results from RNA gel blot analysis (9) and cDNA cloning have demonstrated that two distinct sizes (1.7 and 1.3 kb) of T11 mRNA exist in the human T cell (9-11). In contrast, only a single 1.3-kb T11 mRNA is observed in murine thymocytes (9, 13). Sequence analysis identified two polyadenylylation signals (AATAAA) within the 3'-untranslated region of the human T11 gene (Fig. 2, underlined), whereas the mouse T11 cDNA contained only one such site. The difference in size of the two types of human cDNA clones (≈400 bp) (9) can be most easily explained by differential use of polyadenylylation signals since both are present within the genomic exon C (Fig. 2). To determine the DNA sequence at a comparable position flanking the mouse exon, we sequenced 360 bases downstream of the known murine polyadenylylation site and compared mouse and human nucleotide sequences. Alignment of the human and mouse DNA sequences was done with the LOCAL algorithm (Fig. 4). There is extensive homology between the two sequences (62.1%) with a minimum number of gaps introduced to align sequences. Surprisingly, the level of similarity between human 3'-untranslated sequence and murine 3'-flanking sequence is essentially equal to that between human and murine T11 exons (Table 1).

Most interesting is the presence of a second canonical polyadenylylation signal in the murine 3'-flanking sequence at a similar position to the downstream human site (Fig. 4). Why this fails to give rise to a 1.7-kb mRNA species in murine thymocytes is unclear. Noteworthy is the substitution of three adenosine residues in the human cDNA with three guanosine residues in a similar position within the murine flanking sequence. These differences may allow the human polyadenylylation signal sequence to function as a site for poly(A) addition, resulting in the expression of an mRNA with additional 3' sequence. Alternatively, a transcriptional stop sequence may have formed within a 3'-flanking DNA of the murine T11 gene that precludes the use of the downstream AATAAA sequence. We have not found a unique functional role as yet for the longer 1.7-kb human T11 cDNA, since both the 1.7- and 1.3-kb mRNAs encode a protein showing properties of the known T11 (9). However, conservation of 3' sequence for at least several hundred bases (65%) suggests the possibility of a common downstream murine and human regulatory element of the T11 gene.

[Fig. 4 sequence panel: alignment of the human (h) exon C 3'-untranslated sequence, positions +1250 to +1597, with the murine (m) 3'-flanking sequence.]

FIG. 4. Analysis of T11 gene 3'-flanking regions. The polyadenylylation signal sequence is underlined. Note that the human sequence (h) is derived from exon C as shown in Fig. 2 beginning 30 bp downstream of the first polyadenylylation signal sequence. The murine sequence (m) begins immediately on the 3' side of exon C as shown in Fig. 2. The homology value was 165 with parameters identical to those in Fig. 3.

Given that the level of nucleotide similarity between 5'- and 3'-flanking regions of the T11 gene in both species is as great as the homology between murine and human T11 exons, it would appear likely that the T11 gene is similarly regulated in both species. Consistent with this notion is the T-lineage restriction of T11 mRNA expression (9-11, 13). Transfection analysis of the 12-kb T11 genes in cells of the T lineage will ultimately determine which sequences are important for tissue-specific expression.

Note Added in Proof. Further analysis of the human gene with the more sensitive RNase protection assay suggests multiple initiation sites (between positions +16 and +36 and positions +57 and +67), consistent with the absence of a canonical "TATA" sequence.

We acknowledge Forrest Nelson, Hema Ramachandran, and Cindy Knall for excellent technical assistance. D.J.D. was a Fellow of the Leukemia Society of America during the course of most of this work. L.K.C. was supported in part by Biomedical Research Support Grants 2S07RRO5526-24 and ACS 118G. This work was supported by Grant A121226 from the National Institutes of Health.

1. Meuer, S. C., Hussey, R. E., Fabbi, M., Fox, D., Acuto, O., Fitzgerald, K. A., Hodgdon, J. C., Protentis, J. P., Schlossman, S. F. & Reinherz, E. L. (1984) Cell 36, 897-906.
2. Siliciano, R. F., Pratt, J. C., Schmidt, R. E., Ritz, J. & Reinherz, E. L. (1985) Nature (London) 317, 428-430.
3. Fox, D. A., Hussey, R. E., Fitzgerald, K. A., Bensussan, A., Daley, J. F., Schlossman, S. F. & Reinherz, E. L. (1985) J. Immunol. 134, 330-335.
4. Plunkett, M. L., Sanders, M. E., Selvaraj, P., Dustin, M. L. & Springer, T. A. (1987) J. Exp. Med. 165, 664-676.
5. Vollger, L. W., Tuck, D. T., Springer, T. A., Haynes, B. F. & Singer, K. J. (1987) J. Immunol. 135, 358-363.
6. Martin, P. J., Longton, G., Ledbetter, J. A., Newman, W., Braun, M. P., Beatty, P. G. & Hansen, J. A. (1983) J. Immunol. 131, 180-185.
7. Shaw, S., Ginther Luce, G. E., Quinones, R., Gress, R. E., Springer, T. A. & Sanders, M. E. (1986) Nature (London) 323, 262-264.
8. Lay, W. H., Mendes, N. F., Bianco, C. & Nussenzweig, V. (1971) Nature (London) 230, 531-533.
9. Sayre, P. H., Chang, H. C., Hussey, R. E., Brown, N. R., Richardson, N. E., Spagnoli, G., Clayton, L. K. & Reinherz, E. L. (1987) Proc. Natl. Acad. Sci. USA 84, 2941-2945.
10. Sewell, W. A., Brown, M. H., Dunne, J., Owen, M. J. & Crumpton, M. J. (1986) Proc. Natl. Acad. Sci. USA 83, 8718-8722.
11. Seed, B. & Aruffo, S. (1987) Proc. Natl. Acad. Sci. USA 84, 3365-3369.
12. Clayton, L. K., Sayre, P. H., Novotny, J. & Reinherz, E. L. (1987) Eur. J. Immunol. 17, 1367-1370.
13. Sewell, W. A., Brown, M. H., Owen, M. J., Fink, P. J., Kozak, C. A. & Crumpton, M. J. (1987) Eur. J. Immunol. 17, 1015-1020.
14. Williams, A. F., Barclay, A. N., Clark, S. J., Paterson, D. J. & Willis, A. C. (1987) J. Exp. Med. 166, 368-380.
15. Frischauf, A. M., Lehrach, H., Poustka, A. & Murray, N. (1983) J. Mol. Biol. 170, 827-842.
16. Vieira, J. & Messing, J. (1982) Gene 19, 259-268.
17. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
18. Chen, E. Y. & Seeburg, P. (1985) DNA 4, 165-170.
19. Fuse, A., Fujita, T., Yasumitsu, H., Kashima, N., Hasegawa, K. & Taniguchi, T. (1984) Nucleic Acids Res. 12, 9323-9331.
20. van den Elsen, P., Georgopoulos, K., Shepley, B. A., Orkin, S. & Terhorst, C. (1986) Proc. Natl. Acad. Sci. USA 83, 2944-2948.
21. Davis, M. M., Chien, Y. H., Gascoigne, N. & Hedrick, S. M. (1984) Immunol. Rev. 81, 235-258.
22. Hayday, A. C., Diamond, D. J., Tanigawa, G., Folsom, G., Heilig, J. S., Saito, H. & Tonegawa, S. (1985) Nature (London) 316, 828-832.
23. Toyonaga, B., Yoshikai, Y., Vadasz, V., Chin, B. & Mak, T. W. (1985) Proc. Natl. Acad. Sci. USA 82, 8624-8628.
24. Malissen, M., Minard, K., Mjolsness, S., Kronenberg, M., Goverman, J., Hunkapiller, T., Prystowski, M. B., Yoshikai, Y., Fitch, F., Mak, T. W. & Hood, L. (1984) Cell 37, 1101-1110.
25. Smith, T. F., Waterman, M. S. & Burks, C. (1985) Nucleic Acids Res. 13, 645-655.
26. Breathnach, R. & Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383.
27. Ellison, J. & Hood, L. (1982) Proc. Natl. Acad. Sci. USA 79, 1984-1988.
28. Gray, P. M. & Goeddel, D. Y. (1982) Nature (London) 298, 859-863.
29. Fujita, T., Takaoka, C., Matsui, H. & Taniguchi, T. (1983) Proc. Natl. Acad. Sci. USA 80, 7437-7441.
30. Wickner, W. T. & Lodish, H. F. (1985) Science 230, 400-407.
31. Larsen, P. R., Harney, J. M. & Moore, D. D. (1986) Proc. Natl. Acad. Sci. USA 83, 8283-8287.
32. Weaver, R. F. & Weissman, C. (1979) Nucleic Acids Res. 7, 1175-1192.
33. Gigueve, Y., Isobe, K. I. & Grosveld, F. (1985) EMBO J. 4, 2017-2024.
34. Melton, D. W., Konecki, D. S., Brennand, J. & Caskey, C. T. (1984) Proc. Natl. Acad. Sci. USA 80, 2147-2151.