Proc. Natl. Acad. Sci. USA
Vol. 81, pp. 2650-2654, May 1984
Biochemistry

Structure of the 5' ends of immunoglobulin genes: A novel conserved sequence
(B lymphocyte-specific regulation/promoters)

TRISTRAM G. PARSLOW*‡, DEBRA L. BLAIR†, WILLIAM J. MURPHY†, AND DARYL K. GRANNER*

Departments of *Internal Medicine and Biochemistry and the †Diabetes and Endocrinology Research Center, Veterans' Hospital, University of Iowa College of Medicine, Iowa City, IA 52240

Communicated by Leonard A. Herzenberg, December 30, 1983

ABSTRACT

Recent investigations have suggested that tissue-specific regulatory factors are required for immunoglobulin gene expression. Cells of the lymphocytoid pre-B-cell line 70Z/3 contain a constitutively rearranged immunoglobulin κ light chain gene; the nucleotide sequence of this gene exhibits all the known properties of a functionally competent transcription unit. Nevertheless, transcripts derived from this gene are detectable only after exposure of the cells to bacterial lipopolysaccharide, implying that accurate DNA rearrangement is not sufficient to activate expression of the gene. Comparison of the sequence of the 70Z/3 κ light chain gene with those encoding other immunoglobulin heavy and light chains has revealed that a distinctive promoter region structure is characteristic of this multigene family. The sequence A-T-T-T-G-C-A-T lies approximately 70 base pairs upstream from the site of transcriptional initiation in every light chain gene examined; in heavy chain genes, the corresponding location is occupied by the precise inverse (A-T-G-C-A-A-A-T) of this sequence. Although adjacent regions of DNA have diverged extensively in evolution, these octanucleotide sequences are stringently conserved at this location among diverse immunoglobulin genes from at least two mammalian species. The proximity of this conserved octanucleotide block to the site of transcriptional initiation suggests that it may serve as a recognition locus for factors regulating immunoglobulin gene expression in a tissue-specific fashion.

The extraordinary diversity of immunoglobulin molecules is due, in part, to the existence of a large and heterogeneous family of genes encoding variable (V) region domains of the heavy chain and light chain polypeptides. The formation of a complete immunoglobulin transcription unit requires specific DNA rearrangements that fuse a single V gene, along with its 5' flanking sequences, to separate genetic elements coding for the remainder of the polypeptide chain. In the case of the murine κ light chain genes, these rearrangements link the selected V gene to one of four junctional (J) elements located upstream of the κ light chain constant (C) region gene, Cκ (1, 2). Analogous DNA rearrangements are required for assembly of an intact heavy chain gene (3).

Although their nucleotide sequences differ widely, all functional V genes share certain common structural features and are believed to have evolved from a single ancestral sequence. Each comprises two coding regions (exons) separated by a short intervening sequence. The first encodes a hydrophobic signal peptide; the second specifies ~95 amino-terminal residues of the V domain. Each exon contains splice junctions for RNA processing and specifies a few invariant residues thought to be essential for proper function of the protein (4). Two short-sequence elements found near the 3' ends of unrearranged V genes have been implicated in the mechanism of DNA rearrangement (2). In addition, each V gene harbors at its 5' end a functional promoter (5), which can serve as the site of transcriptional initiation in a fully assembled heavy or light chain gene. The precise sequences required for promoter function in these genes have not yet been elucidated.

We investigated the structure and expression of an immunoglobulin light chain gene in the mouse leukemia cell line 70Z/3. Under ordinary growth conditions, cells of this line constitutively express cytoplasmic μ heavy chains without associated light chain synthesis, a phenotype characteristic of the early stages of B-lymphocyte ontogeny (6-8). Although these cells harbor a single rearranged κ light chain gene (7), they ordinarily contain neither κ light chain protein nor its corresponding mRNA. When grown in the presence of bacterial lipopolysaccharide (LPS), however, 70Z/3 cells accumulate cytoplasmic κ light chain mRNA (8) and begin to synthesize κ light chain protein; after 12 hr of optimal LPS treatment, 70-100% express both heavy and light chain determinants on their surfaces (6). Therefore, the 70Z/3 cell line can serve as a model system for the study of several critical events in early lymphocyte differentiation, including κ light chain gene activation and the induction of surface immunoglobulin expression.

In this report, we present the complete V region and 5' flanking DNA sequences of the rearranged κ light chain gene of 70Z/3 and further characterize the effects of LPS treatment on the expression of RNA transcripts derived from this gene. In addition, we identify a highly conserved octanucleotide sequence at the 5' ends of diverse immunoglobulin V genes and present evidence that a distinctive promoter region structure is characteristic of this multigene family.

Abbreviations: LPS, bacterial lipopolysaccharide; bp, base pairs; kb, kilobases or kilobase pairs; V, variable; J, joining; C, constant.
‡Present address: Department of Pathology, University of California at San Francisco, San Francisco, CA 94143.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Cell Culture. 70Z/3 cells, a gift from R. P. Perry, were grown in suspension culture as described (8) in the presence or absence of LPS from Salmonella typhosa (10 μg/ml; Difco). LPS treatment had no effect on either the doubling time (12 hr) or the total polyadenylylated RNA content of these cells.

κ Light Chain Gene Isolation and Sequence Determination. DNA was prepared from the nuclei of untreated 70Z/3 cells as described (9) and digested to completion with EcoRI. Digested DNA (70 μg) was layered over a linear 10-40% sucrose gradient and centrifuged for 24 hr at 20°C in an SW 27 rotor. Fractions containing restriction fragments greater than 14 kb in length were pooled and concentrated by ethanol precipitation. The DNA was then ligated into the EcoRI arms of the bacteriophage Charon 4A, and a clone library was prepared by the method of Maniatis et al. (10). This library was screened for chimeric phage harboring sequences homologous to the Cκ region coding sequence by using the plaque hybridization technique of Benton and Davis (11); a total of 14 κ light chain-bearing plaques were identified among the 68,000 plaques tested. Phage containing the rearranged Cκ allele were identified by restriction mapping of purified phage DNA. Portions of the cloned rearranged gene were subcloned into the plasmid pBR322 and subjected to nucleotide sequence analysis by the method of Maxam and Gilbert (12).

RNA Isolation. For isolation of cytoplasmic RNA, 10⁹ washed cells were lysed by gentle stirring for 5 min in 15 ml of ice-cold 10 mM Tris, pH 7.5/4 mM MgCl2/100 mM KCl/0.05% Triton X-100. After centrifugation for 5 min at 3,000 rpm in an HB-4 rotor at 4°C, the supernatant was adjusted to 20 mM EDTA and 2% sodium dodecyl sulfate. The nuclear pellet was washed in 15 ml of ice-cold 10 mM sodium acetate, pH 5.0/50 mM NaCl and centrifuged as before. Supernatants were combined with those from the previous step and then extracted twice against equal volumes of phenol/chloroform, 1:1 (vol/vol), and twice against absolute ether. The aqueous phases were adjusted to 0.5 M ammonium acetate/10 mM magnesium acetate/1 mM EDTA, and RNA was precipitated by addition of 2.5 vol of ethanol. The RNA samples were then incubated at 37°C for 1 hr with proteinase K (10 μg/ml) in 10 mM Tris, pH 7.5/0.5% sodium dodecyl sulfate, extracted twice against chloroform, and precipitated from ethanol as before.

Nuclear RNA was isolated from the nuclear pellets described above by cesium density centrifugation in the presence of guanidinium thiocyanate (13). The purified RNA was enriched for polyadenylylated sequences by two successive passes over an oligodeoxythymidylate affinity column.

Hybridization Analysis of Cellular RNA. To detect κ light chain-specific transcripts, RNA samples were subjected to electrophoresis on 1% agarose gels containing 6% (wt/vol) formaldehyde, then transferred to nitrocellulose membranes, and probed for Cκ region sequences by the method of Thomas (14). The probe was a radiolabeled 300-base-pair (bp) restriction fragment produced by combined HincII/HinfI digestion of cloned κ light chain cDNA (15).

RESULTS

Only one of the two Cκ alleles of 70Z/3 is rearranged; the second allele retains the germ-line configuration (7, 15). LPS-stimulated induction of κ light chain gene expression occurs without detectable change in the sequence organization of either allele (9). Nelson et al. (16) have observed that substantially larger quantities of κ light chain-specific RNA are synthesized in vitro by nuclei isolated from LPS-treated 70Z/3 cells than by nuclei from untreated cells, implying that LPS acts, at least in part, by inducing transcription of the constitutively rearranged κ light chain gene. To determine the organization of this inducible transcription unit, we first examined the sizes of the RNA transcripts produced. 70Z/3 cells were grown for 20 hr in the presence or absence of LPS. Polyadenylylated RNA isolated from nuclear and cytoplasmic fractions of the cells was subjected to agarose gel electrophoresis under denaturing conditions, transferred to nitrocellulose membranes, and probed for Cκ sequences.

The cytoplasm of untreated 70Z/3 cells contained no detectable κ light chain-specific sequences (Fig. 1). As previously reported by Perry and Kelley (8), however, exposure to LPS resulted in the accumulation of mature-sized [1.2 kilobase (kb)] cytoplasmic κ light chain mRNA. Similar analysis of nuclear RNA revealed that LPS treatment induced the expression of three larger κ light chain transcripts in addition to the mature mRNA. The largest of these is comparable in length (5.1 kb) to a complete κ light chain transcription unit and, presumably, represents the primary transcript of the gene. The two remaining homologous transcripts (4.4 and 2.8 kb) are probably intermediate forms or by-products in processing of the primary transcript (17). No such precursor forms were detectable in the nuclei of cells grown in the absence of LPS; the trace of mature κ light chain mRNA found in nuclear RNA from untreated cells probably reflects the presence of a small minority (<5%) of spontaneously activated cells in the untreated population.

FIG. 1. Induction of nuclear and cytoplasmic κ light chain-specific RNA in 70Z/3 cells exposed to bacterial LPS. RNA isolated from nuclear and cytoplasmic fractions of LPS-treated and untreated cells was subjected to denaturing agarose gel electrophoresis, transferred to nitrocellulose membranes, and probed for Cκ sequences. Cytoplasmic RNA samples contained 3 μg of polyadenylylated RNA per lane; polyadenylylated nuclear RNA samples from control and LPS-treated cells contained 64 μg and 32 μg of RNA per lane, respectively. Sizes of the various transcripts (in kb) are indicated.
[Panel labels: Cytoplasm, Nucleus; LPS. Marked transcript sizes: 5.1, 4.4, 2.8, 1.2.]

The entire rearranged κ light chain gene is contained within a 17-kb EcoRI restriction fragment of 70Z/3 DNA (7, 15). To determine the structure of this gene, DNA isolated from untreated 70Z/3 cells was cleaved to completion with EcoRI and cloned in the coliphage vector Charon 4A. Chimeric phage harboring the rearranged κ light chain locus were identified by plaque hybridization and partial restriction mapping, and selected regions of the cloned gene were subjected to nucleotide sequence analysis.

The sequence of a continuous 1369-bp region at the 5' end of the rearranged gene is depicted in Fig. 2. The site of joining of the V and J elements can be identified between residues 1301 and 1302 (solid arrow); downstream from this site, the nucleotide sequence is identical to that of the unrearranged κ light chain locus of murine embryo DNA (19). In 70Z/3, DNA rearrangement has resulted in fusion of the V gene to the J element lying farthest upstream from the Cκ locus; by convention, this element is designated J1 (residues 1302-1339). Because the translational reading frame of the J1 element is known (2), the amino acid sequence of the V region (residues 1016-1339) could readily be deduced by using as landmarks the relatively invariant amino acids found at particular positions in other κ light chain proteins (4) and the recipient RNA splice site AGG at the 5' end of the second exon (open arrow).

The first exon of a κ light chain gene, comprising the signal peptide coding element and the 5' untranslated sequences, is typically separated from the second exon by a 100- to 400-bp intervening sequence (18, 20-24). Although the sequence of the signal peptide varies considerably among κ light chain genes, it generally consists of a single uninterrupted translational reading frame encoding 16-26 amino acids, beginning with the start codon ATG and ending with the donor RNA splice site GGT. We have identified such a region at residues 776-826 of the sequence shown in Fig. 2.
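The residue ranges quoted above lend themselves to a quick arithmetic check. The following is a minimal sketch in Python, not part of the original analysis: the dictionary keys and function names are hypothetical, the only inputs are the coordinates stated in the text, and the "gap" it reports is only a rough proxy for the length of the intervening sequence.

```python
# Inclusive residue ranges quoted in the text (numbering as in Fig. 2).
# The keys are illustrative labels, not terms defined by the paper.
SEGMENTS = {
    "signal_region": (776, 826),   # ATG start codon through the GGT donor splice site
    "v_region": (1016, 1339),      # V region as stated, J1 included
    "j1": (1302, 1339),            # J1 element
}


def inclusive_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def gap_between(upstream: tuple, downstream: tuple) -> int:
    """Residues lying strictly between two non-overlapping ranges."""
    return downstream[0] - upstream[1] - 1


def summarize(segments: dict) -> dict:
    """Length in nucleotides, whole triplets, and leftover residues per range."""
    summary = {}
    for name, (start, end) in segments.items():
        length = inclusive_length(start, end)
        summary[name] = {
            "nt": length,
            "triplets": length // 3,
            "remainder": length % 3,
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize(SEGMENTS).items():
        print(name, info)
    gap = gap_between(SEGMENTS["signal_region"], SEGMENTS["v_region"])
    print("residues between signal region and V region:", gap)
```

With these coordinates the signal region spans 51 residues (17 triplets, so 16 triplets ahead of GGT if the splice site occupies the last three residues of the range, which is an assumption made here), the V region spans 324 residues (108 triplets), and 189 residues lie between the two ranges, inside the 100- to 400-bp interval cited for the intervening sequence.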

[Fig. 2 sequence panel: nucleotide residues numbered in rows of 100, with the predicted amino acid sequence printed beneath the coding regions.]

FIG. 2. V region and 5' flanking DNA sequence of the rearranged κ light chain gene of 70Z/3 cells. The nucleotide sequences of both DNA strands were determined for the cloned rearranged gene by the method of Maxam and Gilbert (12). The predicted amino acid sequences of the signal peptide and V region domain are indicated, along with the donor and recipient RNA splice junctions (open arrows) and the site of DNA recombination (solid arrow). Nested brackets indicate a zone of partial inverted repeat symmetry within the signal coding region. Sequences of the J1 junctional region and of the "TATA" box homologue are underlined. Asterisks denote the putative site of transcriptional initiation, assuming a first exon 78 ± 2 bp long (18). The conserved sequence A-T-T-T-G-C-A-T (box) lies 154 bp upstream from the 3' end of the first exon.

This region also exhibits other features characteristic of light chain signal-coding sequences: it contains a zone of partial inverted repeat symmetry upstream from the terminal splice junction and encodes a relatively hydrophobic peptide containing a central pair of leucine residues. No other region within the sequence depicted in Fig. 2 fulfills these criteria.

Kelley et al. (18) recently compared the 5' terminal structures of several different V genes and observed that a uniform distance of 78 ± 2 bp separates the initiation site from the donor RNA splice junction at the 3' end of the signal coding region. Application of this principle suggests that transcription of the rearranged κ light chain gene of 70Z/3 is initiated at residues 745-749 (asterisks in Fig. 2). This putative initiation region is located ~555 bp upstream from the 5' end of J1, which in turn lies 4406 bp upstream from the polyadenylylation site at the 3' end of the Cκ region (19). Assuming a tail of 50-250 adenylate residues, the primary transcript of this rearranged gene would be expected to be 5.0-5.2 kb long, a value in excellent agreement with our analysis of nuclear κ light chain transcripts in these cells (Fig. 1). Subsequent RNA processing would generate a 1.1- to 1.3-kb mature κ light chain message.

In examining the published sequences of a variety of immunoglobulin light chain genes, we observed that the octanucleotide A-T-T-T-G-C-A-T is nearly always present ~100 bp upstream from the 5' end of the signal peptide coding region. Fig. 3 depicts portions of the 5' flanking sequences of several murine and human light chain V genes, aligned with respect to this conserved octanucleotide. The sequence A-T-T-T-G-C-A-T occurs without variation in eight of the nine murine κ light chain genes shown (20-25), as well as in the murine λI and λII light chain V regions (26, 27) and the human h101 κ light chain gene locus (5). Two other human κ light chain genes (h122 and the unexpressed h100 pseudogene) and the κ light chain gene expressed by MPC11 mouse myeloma cells (M11) each contain a single base substitution within this octanucleotide (5, 18). In contrast, comparison among the various genes reveals extensive sequence divergence throughout the DNA regions flanking this octanucleotide. The conserved sequence is located 90-110 bp upstream from the ATG start codon in nearly all of the light chain genes shown. For comparisons among immunoglobulin genes, however, the 3' end of the first exon is a preferable landmark, as it occupies a fixed position with respect to the site of transcriptional initiation. In those genes for which the necessary sequence data are available, the conserved octanucleotide lies 150 ± 10 bp upstream from the GGT splice junction at the 3' end of the first light chain exon.

A search for this conserved octanucleotide in the 5' flanking regions of various immunoglobulin heavy chain genes led to an unexpected finding. The sequence A-T-T-T-G-C-A-T rarely occurred in the published heavy chain gene sequences and was never observed at the location typical for light chain genes. Instead, the corresponding position in heavy chain genes was occupied by the precise inverse of the conserved sequence. As illustrated in Fig. 3, the inverse sequence (A-T-G-C-A-A-A-T), with only occasional alterations, is present 150 ± 10 bp upstream from the first RNA splice junction in all of the heavy chain genes for which adequate data are available (28-30).
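The estimate of primary transcript length given earlier in this section is simple addition, and it can be rechecked directly. The sketch below is illustrative only: the constant and function names are hypothetical, and the inputs are the distances and the assumed poly(A) tail range quoted in the text.

```python
# Distances quoted in the text; names are illustrative, not from the paper.
INITIATION_TO_J1_BP = 555         # approximate: initiation region to 5' end of J1
J1_TO_POLYA_SITE_BP = 4406        # 5' end of J1 to the polyadenylylation site
POLY_A_TAIL_RANGE = (50, 250)     # assumed number of adenylate residues
J1_FIRST_RESIDUE = 1302           # first residue of J1 in the Fig. 2 numbering
INITIATION_RESIDUES = (745, 749)  # putative initiation region


def primary_transcript_range_bp(upstream_bp, downstream_bp, tail_range):
    """Shortest and longest expected primary transcript, in nucleotides."""
    body = upstream_bp + downstream_bp
    shortest_tail, longest_tail = tail_range
    return body + shortest_tail, body + longest_tail


def to_kb(length_bp, digits=1):
    """Convert a length in bp to kb, rounded for comparison with gel sizes."""
    return round(length_bp / 1000, digits)


def implied_initiation_residue(j1_first_residue, distance_bp):
    """Residue lying distance_bp upstream of the first residue of J1."""
    return j1_first_residue - distance_bp


if __name__ == "__main__":
    low, high = primary_transcript_range_bp(
        INITIATION_TO_J1_BP, J1_TO_POLYA_SITE_BP, POLY_A_TAIL_RANGE
    )
    print(f"expected primary transcript: {low}-{high} nt "
          f"({to_kb(low)}-{to_kb(high)} kb)")
    print("implied initiation residue:",
          implied_initiation_residue(J1_FIRST_RESIDUE, INITIATION_TO_J1_BP))
```

The sum comes to 5011-5211 nucleotides, which rounds to the 5.0-5.2 kb stated in the text and brackets the 5.1-kb nuclear transcript seen in Fig. 1. Counting 555 bp back from residue 1302 gives residue 747, the midpoint of the putative initiation region at 745-749.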

Gene     -20 to -1             1 to 8    9 to 30                 ATG  GGT

Murine Kappa Genes
70Z/3    TGCCTAGACTGTATCTTGCG  ATTTGCAT  ATTACA M TCAGTAACCACAA  107  154
K2       GCTGTGCCTACCCTTTGCTG  ATTTGCAT  GTACCCAAAGCATAGCTTACTG  100  147
M173B    ATCCTAACTGCTTCTTAATG  ATTTGCAT  ATCCTCACTACATCGCCTTGGG   91  146
M41      ATCCTAACTGCTTCTTAATA  ATTTGCAT  ACCCTCACTGCATCGCCTTGGG   92  144
M167     ----CAGCACTGACCAATGG  ATTTGCAT  AATGCTCCCTAGGGTCCACTTC  106  153
T1       GCAATAACTGGTTCCCAATG  ATTTGCAT  GCTCTCACTTCACTGCCTTGGG   97  146
T2       AGCAACATGAAGACAGTATG  ATTTGCAT  AAGTTTTTCTTTCTTCTAATGT  109  156
K21C     AAACAGTACATACTCCGCTG  ATTTGCAT  ATGAAATAATTMTATAACAGCC   93  143
M11      ACTTCCTTATTTGATGACTC  CTTTGCAT  AGATCCCTAGAGGCCAGCACAG   73  148

Murine Lambda Genes
λI       TAAACCTGTAAATGAAAGTA  ATTTGCAT  TACTAGCCCAGCCCAGCCCATA  106  151
λII      TAAACCTGTAAATGAAAGTA  ATTTGCAT  TACTAGCCCAGCCCAGCCCATA  105  151

Human Kappa Genes
h101     GCCTGCCCCATCCCCTGCTC  ATTTGCAT  GTTCCCAGAGCACAACCTCCTG   96  NA
h122     GCCTGCCCCATCCCCTGCTG  ATTTGCCT  GTTCCTAGAGCACAGCCCCCTG  102  NA
h100     TCATTCTTGCATCTGTTGAA  ATTTTCAT  TTTCAAAAAAACACAGCCAACT   96  NA

Heavy Chain Genes
VH167    TAATGATATAGCAGAAAGAC  ATGCAAAT  TAGGCCACCCTCATCACATGAA  118  NA
VH105    GTAATGCACTGCTCATGAAT  ATGCAAAT  CACCTGGGTCTATGGCAGTAAA  114  159
VH111    GTAATGCACTGCTCATGAAT  ATGCAAAT  CACGCAAGTCTTTGGCAGTAAA  108  153
VH104    GAAGTACCCTGCTCATGAAT  ATGAAAAT  TACCCAAGTCTATGUTAGTMA   108  153
VH108    AAAGTCCCCTGCTCATGAAT  ATGCAAAT  TACCGTTCTCTATGTTGGTTAA  109  154
VH101    TCTCTCAGGAACCTCCCCCA  ATGCAAAG  CAGUCCTCAGGCAGAGGATAAA   85  143

FIG. 3. Conserved octanucleotide sequences in the 5' flanking regions of immunoglobulin genes. Portions of the published nucleotide sequences of several V genes are aligned to demonstrate the selective conservation of the octanucleotides A-T-T-T-G-C-A-T and A-T-G-C-A-A-A-T in light chain and heavy chain genes, respectively. Point mutations within these conserved sequences are underlined. The relative locations of the ATG start codon and of the GGT splice junction at the 3' end of the first exon are indicated for each gene. Murine light chain gene sequences are from this paper (70Z/3) and from refs. 20 (K2), 21 (M173B), 22 (M41), 23 (M167), 24 (T1 and T2), 25 (K21C), 26 (λI), and 27 (λII). The M11 data are from a corrected version (R. P. Perry, personal communication) of the sequence in ref. 18. Human κ light chain sequences are from ref. 5. The VH167 and VH101 sequences are from refs. 28 and 29, respectively; the remaining heavy chain data are from ref. 30. NA, required sequence data not available.

DISCUSSION

Specific DNA rearrangements, which assemble the V, J, and Cκ coding elements to form a single transcription unit, are essential for the synthesis of a functional immunoglobulin protein. These rearrangements are not, however, sufficient to activate transcription of the κ light chain gene. We have demonstrated that untreated cells of the 70Z/3 line contain a fully rearranged κ light chain gene; the sequence of this gene reveals no obvious anomalies that might interfere with its transcriptional or translational function. Nevertheless, we can detect virtually no transcripts derived from this gene in untreated 70Z/3 cells. Expression of this gene can be induced by exposure to LPS, resulting in the accumulation of κ light chain mRNA in the cytoplasm (8) and of its precursor transcripts in the nucleus.

This observation implies that factors other than accurate joining of the V and J loci are required to activate κ light chain gene transcription. Recently, attention has focused upon events occurring within a small (<250 bp) region of DNA closely linked to the Cκ locus (9, 15, 31-33). This region, which lies ~3.5 kb downstream from the site of transcriptional initiation, undergoes localized changes in chromatin structure that correlate with transcriptional activity of the gene; in 70Z/3 cells, these chromatin changes occur after exposure to LPS (9). Chung et al. (33) have observed that the nucleotide sequence of this region is homologous to that of the enhancer elements of certain eukaryotic viruses, elements that can act in cis over a distance of several kilobases to increase the rate of transcription from cellular promoters (34-36). Changes in chromatin structure of the region near the Cκ locus, occurring as a result of LPS treatment, may serve to activate an enhancer-like element at this site, which could, in turn, activate the promoter of the rearranged V gene.

Our analysis of the κ light chain gene of 70Z/3 revealed a previously unrecognized feature of the 5' flanking sequences of immunoglobulin V genes. In every known instance, a distinctive octanucleotide sequence occurs 150 ± 10 bp upstream from the 3' end of the first exon: the sequence A-T-T-T-G-C-A-T is found at this location in light chain V genes, while the inverse (complementary) sequence A-T-G-C-A-A-A-T is characteristic of heavy chain V genes. Although adjacent sequences on either side have diverged extensively in evolution, these octanucleotides have been selectively conserved among diverse V genes in at least two mammalian species. Of particular note, the octanucleotides are more stringently conserved than the A+T-rich region (TATA box) found ~30 bp upstream from the initiation site in these (18) and other genes. Of the five mutations identified in Fig. 3, three occur in unrearranged genes cloned from embryonic tissues; curiously, all five mutations represent purine-pyrimidine transversions.

In conjunction with the findings of Kelley et al. (18), our observations suggest a consensus structure for the 5' end of an immunoglobulin gene (Fig. 4). This structure consists of a first exon (coding region and 5' untranslated sequence) measuring 78 ± 2 bp in length, along with only two short regions of sequence conservation in the 5' flanking DNA: a TATA box homologue and the octanucleotide block, situated ~30 and 70 bp, respectively, upstream from the initiation site. Despite considerable variability in such parameters as the location of the ATG start codon and the length of the first intervening sequence, essentially all published V gene sequences conform to this archetypal pattern. We have uncovered no evidence that the octanucleotide sequences are systematically associated with the 5' ends of genes encoding nonimmunoglobulin proteins (37), although the sequence A-T-T-T-G-C-A-T occurs at an intriguingly similar location in the heavy chain gene of HLA-DR (38), a protein evolutionarily related to the immunoglobulins. The organization of the pseudogene promoter of the unrearranged Cκ locus differs in several respects from the consensus V gene structure and includes no detectable homology to the octanucleotide block (39).

[Fig. 4 diagram labels: Heavy, ATGCAAAT; Light, ATTTGCAT; TATA; Init; ATG; Splice (GGT); 78 ± 2 bp; 150 ± 10 bp.]

FIG. 4. Consensus structure of the 5' ends of immunoglobulin genes. The first exon, comprising the signal coding sequence (crosshatched) and 5' untranslated regions, extends from the transcriptional initiation site (Init) to the first donor RNA splice junction, a distance of 78 ± 2 bp (18). An A+T-rich region (TATA box) is located ~30 bp upstream from the initiation site. The octanucleotide block, situated 150 ± 10 bp upstream from the 3' end of the first exon, exhibits one of two complementary sequences: A-T-T-T-G-C-A-T in light chain genes or A-T-G-C-A-A-A-T in heavy chain genes. The remainder of the 5' flanking sequence varies widely among different immunoglobulin genes.
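The relationship between the two octanucleotides, and the observation that all five variants are transversions, can both be checked mechanically. The sketch below is an illustration rather than the authors' procedure: the function and variable names are hypothetical, and it assumes that "inverse (complementary)" means the reverse complement, that is, the light chain sequence read 5' to 3' on the opposite strand. The variant octanucleotides are copied from Fig. 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PURINES = {"A", "G"}

LIGHT_CONSENSUS = "ATTTGCAT"
HEAVY_CONSENSUS = "ATGCAAAT"

# Variant octanucleotides from Fig. 3, paired with the consensus for their class.
VARIANTS = {
    "M11": ("CTTTGCAT", LIGHT_CONSENSUS),
    "h122": ("ATTTGCCT", LIGHT_CONSENSUS),
    "h100": ("ATTTTCAT", LIGHT_CONSENSUS),
    "VH104": ("ATGAAAAT", HEAVY_CONSENSUS),
    "VH101": ("ATGCAAAG", HEAVY_CONSENSUS),
}


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, giving the opposite strand 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def substitutions(variant: str, consensus: str) -> list:
    """(1-based position, consensus base, variant base) for every mismatch."""
    if len(variant) != len(consensus):
        raise ValueError("sequences must be the same length")
    return [
        (i, c, v)
        for i, (c, v) in enumerate(zip(consensus, variant), start=1)
        if c != v
    ]


def is_transversion(base_a: str, base_b: str) -> bool:
    """True when one base is a purine and the other a pyrimidine."""
    return (base_a in PURINES) != (base_b in PURINES)


if __name__ == "__main__":
    print("reverse complement of light consensus:",
          reverse_complement(LIGHT_CONSENSUS))
    for gene, (variant, consensus) in VARIANTS.items():
        for pos, ref, alt in substitutions(variant, consensus):
            kind = "transversion" if is_transversion(ref, alt) else "transition"
            print(f"{gene}: position {pos} {ref}->{alt} ({kind})")
```

Under that reading, the reverse complement of ATTTGCAT is ATGCAAAT, whereas simply reversing the letters would give TACGTTTA. Each of the five variants differs from its consensus at a single position, and every change swaps a purine for a pyrimidine or the reverse, in line with the statement in the text.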

The selective evolutionary conservation of the octanucleotide block implies that it may serve a significant biologic function. The nature of this function, however, remains obscure. In every V gene studied, the octanucleotide lies upstream from the site of transcriptional initiation; consequently, it is not transcribed and cannot be directly involved in events that occur subsequent to transcription. Recently, several laboratories have presented evidence that factors unique to B lymphoid cells are essential for transcriptional activation of V gene promoters (32, 40-43). Some of these B-cell factors may be required to activate the enhancer-like activity of sequences downstream from the V gene. Our findings raise the additional possibility that certain tissue-specific factors may recognize distinctive sequences at the 5' termini of these genes and, thereby, select the appropriate site for transcriptional initiation. Alternatively, factors binding to the octanucleotide sequences may serve to modulate the rate of immunoglobulin gene transcription during various stages of B-lymphocyte ontogeny. Clearly the most remarkable property of the octanucleotide block is the fact that, at this unique site, the sequences of light chain genes are precisely complementary to those of heavy chain genes. We propose that this distinctive sequence element may have a role in molecular events that require discrimination between these two diverse classes of genes, perhaps serving as a binding site for transcription factors mediating the coordinate regulation of heavy chain and light chain gene expression.

We thank C. Katzen and C. Caldwell for technical assistance, M. Granner for software, and K. R. Yamamoto for reviewing the manuscript. This work was supported in part by National Institutes of Health Grant AM25295 and by funds from the Veterans Administration. T.G.P. was supported by Medical Scientist Training Grant GM07337. D.K.G. was a Veterans Administration Medical Investigator.

1. Seidman, J. G. & Leder, P. (1978) Nature (London) 276, 790-795.
2. Max, E. E., Seidman, J. G. & Leder, P. (1979) Proc. Natl. Acad. Sci. USA 76, 3450-3454.
3. Early, P., Huang, H., Davis, M., Calame, K. & Hood, L. (1980) Cell 19, 981-992.
4. Wu, T. T. & Kabat, E. A. (1970) J. Exp. Med. 132, 211-240.
5. Bentley, D. L., Farrell, P. J. & Rabbits, T. H. (1982) Nucleic Acids Res. 10, 1841-1856.
6. Paige, C. J., Kincade, P. W. & Ralph, P. (1978) J. Immunol. 121, 641-647.
7. Maki, R., Kearney, J., Paige, C. J. & Tonegawa, S. (1980) Science 209, 1366-1369.
8. Perry, R. P. & Kelley, D. E. (1979) Cell 18, 1333-1339.
9. Parslow, T. G. & Granner, D. K. (1982) Nature (London) 299, 449-451.
10. Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K. & Efstratiadis, A. (1978) Cell 15, 687-701.
11. Benton, W. D. & Davis, R. W. (1977) Science 196, 180-182.
12. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
13. Parslow, T. G., Milburn, G. L., Lynch, R. G. & Granner, D. K. (1983) Science 220, 1389-1391.
14. Thomas, P. S. (1980) Proc. Natl. Acad. Sci. USA 77, 5201-5205.
15. Parslow, T. G. & Granner, D. K. (1983) Nucleic Acids Res. 11, 4775-4792.
16. Nelson, K., Mather, E. & Perry, R. P. (1984) Nucleic Acids Res. 12, 1911-1923.
17. Perry, R. P., Kelley, D. E., Coleclough, C., Seidman, J. G., Leder, P., Tonegawa, S., Matthyssens, G. & Weigert, M. (1980) Proc. Natl. Acad. Sci. USA 77, 1937-1941.
18. Kelley, D. E., Coleclough, C. & Perry, R. P. (1982) Cell 29, 681-689.
19. Max, E. E., Maizel, J. V. & Leder, P. (1981) J. Biol. Chem. 256, 5116-5120.
20. Nishioka, Y. & Leder, P. (1980) J. Biol. Chem. 255, 3691-3694.
21. Max, E. E., Seidman, J. G., Miller, H. & Leder, P. (1980) Cell 21, 793-799.
22. Seidman, J. G., Max, E. E. & Leder, P. (1979) Nature (London) 280, 370-375.
23. Selsing, E. & Storb, U. (1981) Cell 25, 47-58.
24. Altenburger, W., Steinmetz, M. & Zachau, H. G. (1980) Nature (London) 287, 603-607.
25. Heinrich, G., Traunecker, A. & Tonegawa, S. (1984) J. Exp. Med. 159, 417-435.
26. Hozumi, N., Wu, G. E., Murialdo, H., Roberts, L., Vetter, D., Fife, W. L., Whiteley, M. & Sadowski, P. (1981) Proc. Natl. Acad. Sci. USA 78, 7019-7023.
27. Wu, G. E., Govindji, N., Hozumi, N. & Murialdo, H. (1982) Nucleic Acids Res. 10, 3831-3843.
28. Clarke, C., Berenson, J., Goverman, J., Boyer, P. D., Crews, S., Siu, G. & Calame, K. (1982) Nucleic Acids Res. 10, 7731-7749.
29. Kataoka, T., Nikaido, T., Miyata, T., Moriwaki, K. & Honjo, T. (1982) J. Biol. Chem. 257, 277-285.
30. Cohen, J. B., Effron, K., Rechavi, G., Ben-Neriah, Y., Zakut, R. & Givol, D. (1982) Nucleic Acids Res. 10, 3353-3370.
31. Weischet, W. O., Glotov, B. O., Schnell, H. & Zachau, H. G. (1982) Nucleic Acids Res. 10, 3627-3645.
32. Queen, C. & Baltimore, D. (1983) Cell 33, 741-748.
33. Chung, S.-Y., Folsom, V. & Wooley, J. (1983) Proc. Natl. Acad. Sci. USA 80, 2427-2431.
34. Gruss, P., Dhar, R. & Khoury, G. (1981) Proc. Natl. Acad. Sci. USA 78, 943-947.
35. Banerji, J., Rusconi, S. & Schaffner, W. (1981) Cell 27, 299-308.
36. Wasylyk, B., Wasylyk, C., Augereau, P. & Chambon, P. (1983) Cell 32, 503-514.
37. Dayhoff, M. O., Chen, H., Hunt, L. T., Barker, W. C., Yeh, L.-S., George, D. G. & Orcutt, B. C., eds. (1983) Sequence Database (National Biomedical Research Foundation, Washington, DC).
38. Das, H. K., Lawrance, S. K. & Weissman, S. M. (1983) Proc. Natl. Acad. Sci. USA 80, 3543-3547.
39. Van Ness, B. G., Weigert, M., Coleclough, C., Mather, E. L., Kelley, D. E. & Perry, R. P. (1981) Cell 27, 593-602.
40. Gillies, S. D., Morrison, S. L., Oi, V. T. & Tonegawa, S. (1983) Cell 33, 717-728.
41. Banerji, J., Olson, L. & Schaffner, W. (1983) Cell 33, 729-740.
42. Falkner, F. G. & Zachau, H. G. (1982) Nature (London) 298, 286-288.
43. Stafford, J. & Queen, C. (1983) Nature (London) 306, 77-79.