
MOLECULAR AND CELLULAR BIOLOGY, Nov. 1993, p. 6957-6968, Vol. 13, No. 11
0270-7306/93/116957-12$02.00/0
Copyright © 1993, American Society for Microbiology

V(D)J Recombination Coding Junction Formation without DNA Homology: Processing of Coding Termini

NIKOLAI V. BOUBNOV,1,2 ZACHARY P. WILLS,1 AND DAVID T. WEAVER1,3*

Division of Tumor Immunology, Dana-Farber Cancer Institute,1 and Departments of Medicine2 and Microbiology and Molecular Genetics,3 Harvard Medical School, Boston, Massachusetts 02115

Received 5 May 1993/Returned for modification 23 June 1993/Accepted 18 August 1993

Coding junction formation in V(D)J recombination generates diversity in the antigen recognition structures of immunoglobulin and T-cell receptor molecules by combining processes of deletion of terminal coding sequences and addition of nucleotides prior to joining. We have examined the role of coding end DNA composition in junction formation with plasmid substrates containing defined homopolymers flanking the recombination signal sequence elements. We found that coding junctions formed efficiently with or without terminal DNA homology. The extent of junctional deletion was conserved independent of coding ends with increased, partial, or no DNA homology. Interestingly, G/C homopolymer coding ends showed reduced deletion regardless of DNA homology. Therefore, DNA homology cannot be the primary determinant that stabilizes coding end structures for processing and joining.

Immunoglobulin (Ig) and T-cell receptor (TCR) variable genes are assembled from germ line variable (V), diversity (D), and joining (J) gene segments during lymphoid differentiation in a programmed rearrangement process termed V(D)J recombination. For each rearrangement event, a pair of recombination signal sequences (RSS) flanking each V, D, or J element contains the necessary DNA sequence information to direct recombination when recombination substrates are introduced to recombinase-positive cell lines (1, 17). RSS are composed of a conserved palindromic heptamer (consensus = CACAGTG) separated by either a 12- or 23-bp spacer from a nonamer (consensus = ACAAAAACC). The two recombination products are an RSS junction (signal junction) and a coding junction. Whereas RSS junctions are formed with precision, coding junctions produce extensive junctional variability.

Coding junction variability is of primary importance for creating diversity in the immune system (42). Coding junction diversity originates by a complex process of deletion and addition of nucleotides prior to DNA ligation that is completed by several proteins. Nucleotide addition is associated with terminal deoxynucleotidyltransferase (TdT) activity (19, 21). The genes required for coding end deletions have not been identified. However, the scid mutation preferentially disturbs the process of coding junction formation during V(D)J recombination, leading to aberrant rearrangement products with unusually extensive deletions (14, 24, 25, 37; reviewed in reference 36). An interesting feature of scid cells is that they are radiation sensitive, suggesting that scid functions in double-strand break repair in DNA damage responses (3, 7, 13). The observations of the asymmetry of V(D)J recombination reaction products and the requirements for several distinct gene products are indicative of a multiprotein assembly required for coding junction formation.

Specific coding sequences have not been thought of as contributing to the V(D)J reaction mechanism per se, because of the large divergence of DNA sequence of V, D, and J regions flanking the RSS elements. Nevertheless, short sequence homologies at V(D)J recombination breakpoints have been observed under conditions in which there is no ambiguity about the rearrangement partners in endogenous Ig events (6, 10, 11, 27) or about whether the rearrangements studied are involved in immune system selection pressures (2, 18). Clearly, coding end homology is not absolutely required for V(D)J gene rearrangement to proceed; however, the generality of the use of DNA homology for the reaction mechanism is uncertain.

In this report, we examine the mechanism of coding junction formation by using V(D)J rearrangement cassettes containing defined coding DNA sequences. Joints formed by coding ends with complete, limited, and no homology all had similar features of restricted base loss or P-nucleotide addition. However, the nucleotide composition of coding ends was found to influence the processing of coding end intermediates. Thus, coding joint resolution in V(D)J recombination must be flexible to accommodate several processing alternatives irrespective of DNA terminal homology.

* Corresponding author.

MATERIALS AND METHODS

Plasmid construction. V(D)J recombination plasmid substrates were derived from pJH290 (17) by deletion of the pJH290 RSS to facilitate oligonucleotide substitution. The RSS with the 12-bp spacer [RSS(12)] flanked by two SalI sites of pJH290 was removed and replaced with an incomplete RSS containing a unique SalI and EagI site with the EagI site in the 12-bp spacer [SalI-(EagI 12-bp spacer)-nonamer]. Next, the RSS with the 23-bp spacer [RSS(23)] was removed by BamHI digestion and replaced with an incomplete RSS containing a unique BamHI and XhoI site with the XhoI site in the 23-bp spacer [BamHI-(XhoI 23-bp spacer)-nonamer]. The cloning orientation of these oligonucleotides was determined by DNA sequencing, and we selected pNB108 for construction of coding junction recombination templates.

We next added double-stranded oligonucleotides containing SalI and EagI ends followed by addition of double-stranded oligonucleotides containing BamHI and XhoI ends in sequential constructions. Insertion of double-stranded oligonucleotides containing specific sequences (coding end

DNA) flanking a heptamer element recreated an RSS with either the appropriate 12- or 23-bp spacer. Therefore, the RSS(12) for these constructions is 5'-CACAGTGCGGCCGACTGGAACAAAAACC, and the RSS(23) is 5'-GGTTTTTGTACAGCCAGACAGTGGAGCTCGAGCACTGTG (heptamer and nonamer elements are underlined). Gn, An, Tn, and Cn (n = 5 or 10), R1 (5'-GTCGACAGT-CACAGTG-EagI), and R2 (5'-XhoI-CACTGTG-GCGCCTGGATC-BamHI), flanking the heptamer elements in the oligonucleotides for cloning, were synthesized. Single homopolymer recombination substrates were also matched with a specific sequence of mixed nucleotide composition (R1) flanking the RSS(12).

Substrate:        5'-GGGGGGGGGG [RSS(12)] ... [RSS(23)] TTTTTTTTTT-3'
                  3'-CCCCCCCCCC                         AAAAAAAAAA-5'

Coding junction:  5'-GGGGGGGGGGTTTTTTTTTT-3'
                  3'-CCCCCCCCCCAAAAAAAAAA-5'

FIG. 1. V(D)J recombination substrates measuring coding junction formation. The recombination zone of V(D)J recombination substrate pG10/T10-Cd is shown. Open triangles represent an RSS(12), and shaded triangles represent an RSS(23). Plasmids are named on the basis of the nucleotide compositions of their two coding sequences. Of the two products for V(D)J recombination, the junction remaining on the plasmid following rearrangement is illustrated. See text for a discussion of the plasmid bacterial selection scheme diagnostic of gene rearrangements in mammalian cells.

Cell culture, transient transfection, and the V(D)J recombination assay. HDR37A B-lymphoma cells contain 10 copies of stably integrated RAG1 and RAG2 genes expressed under the control of the Drosophila HSP70 heat shock-inducible promoter (29). 18-8 is an Abelson murine leukemia virus-transformed pre-B-cell line that is recombinase positive without induction. Cells were maintained at 37°C in a 5% CO2 atmosphere in RPMI 1640 supplemented with 10% heat-inactivated fetal calf serum, 100 U of penicillin-streptomycin per ml, and, for HDR37A cells, 1.5 mM L-histidinol (Sigma). In each experiment, 200 ng of plasmid DNA was transfected into 3 x 10^6 HDR37A or 18-8 cells, using the DEAE-dextran method previously described (16, 17, 29). For HDR37A transfections, expression of RAG1 and RAG2 was activated 6 h after transfection by incubation at 43°C for 1 h and further propagation at 37°C.

The coding junction (Cd) plasmids were substrates for V(D)J recombination in which gene rearrangement is scored in a bacterial transformation assay. Removal of the λ oop transcriptional terminator during rearrangement in mammalian cells then allows transcription of the chloramphenicol acetyltransferase gene in the plasmid introduced into bacteria, as described elsewhere (16, 31). Plasmid DNA was recovered from mammalian cells by Hirt lysis at 36 h (HDR37A) and 48 h (18-8) after transfection. Recovered plasmid DNAs were digested with DpnI to remove DNA not replicated in mammalian cells prior to electroporation into Escherichia coli DH10B (9). The transformation efficiency of electrocompetent DH10B was approximately 1.5 x 10^10 colonies per µg of pUC19. Recombinants from mammalian cells were detected by plating DH10B transformants on ampicillin (100 µg/ml)-chloramphenicol (20 µg/ml) (Cam Amp) plates. Total transfected and replicated plasmid DNA (DpnIr) was quantitated by plating on ampicillin (100 µg/ml) (Amp) plates. Recombination frequencies of recovered plasmid DNAs from transfected cells were determined as previously described (16, 17); recombination frequency (percent) = (number of Camr Ampr colonies/number of Ampr colonies) x 100. Recombination levels were normalized to the recombination level of pJH290 in each experiment as discussed in the text.

DNA sequencing. DNA sequencing analysis was carried out with Sequenase (U.S. Biochemical). Sequencing primers were 5'-ATGTGAGTTAGCTCACTCATTAGGC, 121 bp from RSS(12); 5'-ACGATGCCATTGGGATATATCAACGG, 79 bp from RSS(23); and M13 reverse primer, 29 bp from RSS(12). Homopolymer G and/or C tracks frequently generated compressions at the transitions from G to C or C to G polymers. Compressions were strand specific on 6% denaturing polyacrylamide gels, thus necessitating sequencing on both strands under the inosine modification of the sequencing protocols. Mn2+ conditions were also used according to the instructions of the supplier when the M13 reverse primer was required.

RESULTS

Coding end terminology. To study the mechanism of coding junction formation, we constructed a series of V(D)J rearrangement cassettes differing by the composition of their coding ends (DNA sequences flanking the RSS elements). Plasmid substrates are named by the nucleotide composition of their two coding ends, where the nucleotides joining together on the top strand are shown (Fig. 1). The strand that is 5'-(coding sequence)-(CACAGTG-12-bp spacer-ACAAAAACC) and (GGTTTTTGT-23-bp spacer-CACTGTG)-(coding sequence)-3' is designated for each substrate. In pG10/T10-Cd, the two coding ends contain homopolymers of G10 flanking the RSS(12) and T10 flanking the RSS(23). Rearrangement leads to a coding junction on the plasmid where the G10 and T10 regions need to be ligated on the same strand.

V(D)J rearrangement of templates containing mixed nucleotide composition at coding ends. Plasmid substrates (pR1/R2-Cd and pJH290) were introduced into HDR37A cells, and V(D)J rearrangement was scored (Materials and Methods). The pR1/R2-Cd and pJH290 coding ends mimic V(D)J coding sequences in that they have a mixed DNA composition. If limited DNA homology (also called DNA overlap) affects the formation of coding joints when available, then we would expect these residues to appear at the junction borders very frequently. An alignment of coding ends might be possible as a direct consequence of the end structures following cleavage, or it may occur following nucleolytic processing of the ends and/or base addition. We calculated the DNA overlap potential for the terminal six nucleotides for the coding end partners of pR1/R2-Cd. A range of 0 to 6 bp deleted per coding end is generally observed during V(D)J recombination. pR1/R2-Cd has one-nucleotide redundancy at three residues on the RSS(12) side and six residues on the RSS(23) side within the first six nucleotides. Single-nucleotide DNA overlap has previously been implicated in V(D)J recombination (10). In 40% (15 of 37) of the pR1/R2-Cd rearrangements, the junctional residues were ambiguous in assignment to either of the two coding ends (Fig. 2). The incidence of these events by chance occurrence was calculated to be 44% (16 of 36). For pJH290 rearrangements, DNA overlap occurs in 61% (30 of 49) of the joints that are either single or dinucleotides (31) and is found by chance at 75%.

Isolates  Coding-RSS(12)  P    Coding-RSS(23)    C12  P12  P23  C23
          CAGGTCGACAGT         GCGCCTGGATCGGAT
1/37      CAGGTCGACAGT    A    -CGCCTGGATCGGAT     0    1    0   -1
1/37      CAGGTCGACAGT         ----CTGGATCGGAT     0    0    0   -4
3/37      CAGGTCGACAGT         ------GGATCGGAT     0    0    0   -6
2/37      CAGGTCGACAG-    C    GCGCCTGGATCGGAT    -1    0    1    0
1/37      CAGGTCGACAG-         -CGCCTGGATCGGAT    -1    0    0   -1
3/37      CAGGTCGACAG-         ---CCTGGATCGGAT    -1    0    0   -3
3/37      CAGGTCGACA--         -----TGGATCGGAT    -2    0    0   -5
3/37      CAGGTCGAC---         GCGCCTGGATCGGAT    -3    0    0    0
1/37      CAGGTCGAC---         -CGCCTGGATCGGAT    -3    0    0   -1
1/37      CAGGTCGAC---         --GCCTGGATCGGAT    -3    0    0   -2
1/37      CAGGTCGAC---         ---CCTGGATCGGAT    -3    0    0   -3
3/37      CAGGTCGAC---         ----CTGGATCGGAT    -3    0    0   -4
1/37      CAGGTCG-----    GC   GCGCCTGGATCGGAT    -5    0    2    0
2/37      CAGGTCG-----         GCGCCTGGATCGGAT    -5    0    0    0
1/37      CAGGTCG-----         -CGCCTGGATCGGAT    -5    0    0   -1
3/37      CAGGTCG-----         --GCCTGGATCGGAT    -5    0    0   -2
1/37      CAGGTCG-----         ---CCTGGATCGGAT    -5    0    0   -3
1/37      CAGGTCG-----         -----TGGATCGGAT    -5    0    0   -5
2/37      CAGGTCG-----         -------GATCGGAT    -5    0    0   -7
2/37      CAGGTC------         -CGCCTGGATCGGAT    -6    0    0   -1
1/37      CA----------         ----------CGGAT   -10    0    0  -10

FIG. 2. Coding junctions of a randomized coding end substrate, pR1/R2-Cd, in HDR37A cells. The coding junction plasmid, pR1/R2-Cd, was examined for V(D)J recombination by transient transfection of HDR37A cells (RAG1 and RAG2 heat shock inducible). The structures of the coding junctions are illustrated, with P nucleotides underlined and in the center of these junctions where observed. The extent of deletion is noted graphically as a dashed line. The left column shows the number of identical coding joints divided by the total number of coding junctions evaluated for each substrate. Underlined nucleotides in coding sequences cannot be assigned unambiguously to either coding end. Values for C12 and C23 are the number of nucleotides deleted from the coding-RSS(12) and coding-RSS(23) ends, respectively. Values for P12 and P23 are the number of P nucleotides retained from the coding-RSS(12) and coding-RSS(23) ends, respectively. All recombinants listed are independent isolates.
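The recombination frequency defined in Materials and Methods, and its normalization to pJH290 within the same experiment, can be sketched in a few lines of Python. This is an illustration only: the function and variable names are hypothetical, the colony counts are the pR1/T5-Cd and pJH290 entries for experiments 3, 5, and 7 of Table 1, and the choice of the sample standard deviation is an assumption rather than something the text states.

```python
from statistics import mean, stdev


def recombination_frequency(cam_amp_colonies, amp_colonies):
    """R (%) = (Camr Ampr colonies / Ampr colonies) x 100."""
    return 100.0 * cam_amp_colonies / amp_colonies


def corrected_frequency(r_substrate, r_reference):
    """Rc (%): substrate R as a percentage of the pJH290 R
    measured in the same transfection experiment."""
    return 100.0 * r_substrate / r_reference


# experiment number -> (No. Ampr, No. Camr Ampr), copied from Table 1
PJH290 = {3: (1_170_000, 46_800), 5: (1_520_000, 62_300), 7: (2_110_000, 78_700)}
PR1_T5 = {3: (800_000, 420), 5: (1_140_000, 1_300), 7: (2_620_000, 5_100)}


def summarize(substrate, reference):
    """Return per-experiment Rc values, their mean (Rav), and sample SD."""
    rc_values = []
    for expt in sorted(substrate):
        amp, cam_amp = substrate[expt]
        ref_amp, ref_cam_amp = reference[expt]
        r = recombination_frequency(cam_amp, amp)
        r_ref = recombination_frequency(ref_cam_amp, ref_amp)
        rc_values.append(corrected_frequency(r, r_ref))
    return rc_values, mean(rc_values), stdev(rc_values)


if __name__ == "__main__":
    rc, rav, sd = summarize(PR1_T5, PJH290)
    print([round(x, 2) for x in rc], f"Rav = {rav:.1f} +/- {sd:.1f}")
```

Under these assumptions the sketch gives an average corrected frequency of 3.1 with a standard deviation of 2.0 for pR1/T5-Cd, in line with the tabulated value. Individual Rc values can differ in the last digit from the table, which appears to have been computed from rounded R values.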
With limited overlap potential, V(D)J recombinase does not appear to preferentially resolve the junctions at positions of DNA homology.

Recombination substrates containing nonhomologous DNA at coding ends. Eukaryotic cells are capable of end joining of DNA strands that have no overt DNA homology at their ends (32). V(D)J recombination may potentially duplicate these features of generalized end joining during the process of coding junction synthesis. Similarly, the mechanics of coding end processing could include the recruiting of short stretches of DNA homology to enhance particular junctional outcomes. To experimentally resolve the DNA sequence preferences for coding junction formation, we constructed V(D)J recombination templates containing homopolymeric DNA sequences at the coding ends participating in the rearrangement event. The use of homopolymers afforded us the ability to monitor the fate of coding end sequences in coding joint synthesis by unambiguously identifying each nucleotide. We used 10-mer and 5-mer homopolymers that would extend beyond the average deletion size (3 to 4 bp per coding strand).

To evaluate the role of coding end sequences in determining the V(D)J recombination frequency, we compared the recombination potential of each substrate from a minimum of three independent experiments with that of a standard substrate, pJH290. pJH290 yields a reproducibly high recombination frequency in HDR37A cells (Rav = 5.2; Tables 1 and 2). For each experiment listed with defined coding end substrates, the recombination frequency (percent) is corrected for the recombination frequency of pJH290 in the same experiment (Rc [%]). We refer to the average corrected recombination level (Rav) for each substrate in comparisons with other substrates. Using this normalization, pR1/R2-Cd produces a recombination frequency that is about 30% of the values for pJH290 on average (Rav = 30). R1 and R2 are specific mixed nucleotide composition coding ends added to the recombination substrate plasmid by oligonucleotide insertion (Materials and Methods).

Templates with a T5 or T10 homopolymer flanking RSS(23) and mixed nucleotide composition (R1) flanking RSS(12) were less efficient as recombination substrates (Table 1). Recombinants from pR1/T10-Cd (Rav = 0.5) were found at 1.7% of the level observed for pR1/R2-Cd, whereas pR1/T5-Cd (Rav = 3.0) was diminished only to 10% of the pR1/R2-Cd level. This level of reduction of recombination frequency could indicate that coding junction product formation was dependent on coding end DNA sequences. Alternatively, the reduced recombination frequency could indicate inhibition of initiation steps in V(D)J recombination such as RSS recognition or cleavage. RSS and coding junction formation are uncoupled in the reaction mechanism (15, 22, 24, 38). Furthermore, RSS ends are not processed by the same machinery used for coding ends because of the

characteristically precise RSS-RSS junction. Since no coding end sequences appear in RSS junctions, any effect of coding sequences would be expected to occur at initiation steps.

TABLE 1. Recombination frequencies of V(D)J substrates with nonhomologous coding ends

Substrate                  Expt   No. Ampr    No. Camr Ampr   R (%)a   Rc (%)b   Rav (% ± SD)c
Mixed nucleotide ends
  pJH290                   1        147,000      16,000       10.9     100       NA
                           2        158,000       8,400        5.3
                           3      1,170,000      46,800        4.0
                           4      1,420,000      65,900        4.6
                           5      1,520,000      62,300        4.1
                           6      1,110,000      38,900        3.5
                           7      2,110,000      78,700        3.7
                           8      1,080,000      27,900        2.6
Nonhomologous coding ends
  pR1/T5-Cd                3        800,000         420        0.053     1.3     3.1 ± 2.0
                           5      1,140,000       1,300        0.114     2.8
                           7      2,620,000       5,100        0.195     5.3
  pR1/T10-Cd               3        510,000         220        0.043     1.08    0.5 ± 0.5
                           5      1,780,000         100        0.006     0.15
                           7      1,650,000         120        0.007     0.19
  pG10/T5-Cd               4        550,000       4,600        0.84     18.3     10 ± 7
                           6      2,360,000       3,200        0.14      4.0
                           7      1,510,000       4,600        0.30      8.1
  pG10/T10-Cd              1        192,000         930        0.484     4.4     2.8 ± 1.4
                           4        660,000       1,040        0.158     3.4
                           6      2,350,000       1,160        0.049     1.4
                           7      1,280,000         880        0.069     1.9
  pC10/T5-Cd               4      1,260,000         910        0.072     1.6     2.4 ± 0.8
                           6      2,280,000       2,000        0.088     2.5
                           7      1,500,000       1,800        0.120     3.2
  pC10/T10-Cd              4        730,000         300        0.041     0.89    0.6 ± 0.3
                           6      3,220,000         290        0.009     0.26
                           7      2,020,000         590        0.029     0.78
  p(CG)5/(TA)5-Cd          2         99,000       1,360        1.37     26       27 ± 15
                           3        260,000       5,000        1.92     48
                           6      2,970,000      16,600        0.56     16
                           8        830,000       3,610        0.43     17

a Calculated as percentage Ampr Camr colonies from Ampr colonies.
b Recombination frequency corrected for the recombination frequency of pJH290 in the same transfection experiment.
c Average corrected recombination frequency. NA, not applicable.

If the initiation of V(D)J recombination were affected by homopolymer coding sequences, then we would expect that RSS product formation would be as low as coding junction product formation. Signal junction product formation was significantly reduced (>100-fold) in substrates containing A and T homopolymers, whereas G and C homopolymers appeared to have lesser effects on the cleavage reaction (5). A comprehensive examination of the effects of homopolymer DNA sequences on the initiation of V(D)J recombination is presented in a separate study (5). The similarity in low product formation for coding and signal junctions suggests that the effects of the T homopolymer are mostly on initiation steps.

Nonhomologous homopolymers at the two coding ends were also tested in other V(D)J substrates (pG10/T10-Cd, pG10/T5-Cd, pC10/T10-Cd, and pC10/T5-Cd). The average corrected recombination levels of these templates in HDR37A cells were 3- to 50-fold lower than those of pR1/R2-Cd (pG10/T10-Cd [Rav = 2.8], pG10/T5-Cd [Rav = 10], pC10/T10-Cd [Rav = 0.6], and pC10/T5-Cd [Rav = 2.4]) (Table 1). Considering that the presence of a T10 homopolymer diminishes the level of initiation of V(D)J recombination in signal junction plasmids by >100-fold (5), these data suggest that the ability to rejoin nonhomologous coding ends is actually quite efficient once cleavage events have occurred. When matched with the same alternate coding end (G or C), there is a consistent reduction of 3.6- to 6.2-fold in the recombination efficiency when T10 instead of T5 coding ends are used (Table 1). This effect is also probably related to the ability to initiate recombination, since RSS junctions form more poorly with T10 than with T5 homopolymers (5). The substrate p(CG)5/(TA)5-Cd (Rav = 27) had no homology at coding ends but produced a recombination frequency that was similar to the value for pR1/R2-Cd (Rav = 30). Presumably, this substrate undergoes initiation of V(D)J recombination normally, yet there appears to be no effect of terminal nonhomology in influencing the recombination level.

The structures of coding junctions from nonhomologous coding ends were examined to determine the extent of deletion. Coding junctions were formed between nonhomologous DNA sequences in nearly all cases, showing junctional deletion or P-nucleotide addition similar to that of normal coding ends (Fig. 3). The average deletion size for T10 coding ends was 3.1 bp when matched with G10 ends (2.1 bp) and 6.8 bp when matched with C10 ends (2.6 bp). In 78 coding junctions containing G10 or C10, we never observed deletion beyond the homopolymer track. For the 60 rearrangements characterized in which one homopolymer end was T10, there was deletion beyond the T homopolymer in only 10% (6 of 60) (Fig. 3A and B). A low fraction (3 of 22)

A. pG10/T10-Cd
Coding-RSS(12): GGGGGGGGGG    Coding-RSS(23): TTTTTTTTTTG

Isolates   C12  P12  P23  C23
2/38         0    0    0   -2
1/38         0    1    0   -2
4/38         0    1    0   -4
1/38         0    2    0   -5
1/38         0    1    0  -10
4/38        -1    0    0   -2
6/38        -1    0    0   -3
1/38        -2    0    0    0
1/38        -2    0    1    0
2/38        -2    0    0   -1
1/38        -2    0    0   -2
1/38        -2    0    0   -3
1/38        -2    0    0   -7
1/38        -2    0    0  -10
5/38        -4    0    0   -2
1/38        -4    0    0   -4
1/38        -4    0    0   -5
1/38        -5    0    0   -5
1/38        -6    0    0   -1
1/38        -7    0    0   -3
1/38        -8    0    0   -4

B. pC10/T10-Cd
Coding-RSS(12): CCCCCCCCCC    Coding-RSS(23): TTTTTTTTTTGGATCCCCG

Isolates   C12  P12  P23  C23
1/22         0    0    0   -2
1/22         0    0    0  -17
2/22        -1    0    0   -9
3/22        -2    0    0   -1
2/22        -2    0    0   -4
1/22        -2    0    0   -6
2/22        -2    0    0   -7
3/22        -2    0    0   -8
2/22        -2    0    0  -18
1/22        -3    0    0   -1
2/22        -4    0    0   -4
1/22        -6    0    0   -5
1/22        -6    0    0  -13

C. pR1/T10-Cd
Coding-RSS(12): CAGGTCGACAGT    Coding-RSS(23): TTTTTTTTTTGGAT

Isolates   C12  P12  P23  C23
1/14         0    0    0   -4
6/14        -2    0    0   -4
1/14        -4    0    0   -7
1/14        -4    0    0  -13
2/14        -5    0    0   -1
2/14        -6    0    0   -1
1/14       -24    0    0   -6

D. pC10/T5-Cd
Coding-RSS(12): CCCCCCCCCC    Coding-RSS(23): TTTTTGGATCCCCG

Isolates   C12  P12  P23  C23
5/18         0    1    0   -5
1/18         0    1    0  -13
1/18        -1    0    0   -5
1/18        -2    0    0   -2
2/18        -2    0    0   -5
1/18        -2    0    0   -7
1/18        -3    0    0   -5
1/18        -4    0    0   -3
2/18        -4    0    0   -4
1/18        -5    0    0   -5
1/18        -5    0    0   -8
1/18       -10    0    0   -5

FIG. 3. Coding junctions formed from nonhomologous coding end termini. Four coding junction plasmids were transfected into HDR37A cells, and coding junctions were scored as to whether each of the paired coding ends has no terminal DNA homology or restricted DNA homology. pG10/T10-Cd (A), pC10/T10-Cd (B), pR1/T10-Cd (C), and pC10/T5-Cd (D) each contain 5 or 10 bp of G, T, or C homopolymers as noted. The extent of coding junction deletion and P nucleotides are as in Fig. 2. See text for details of these recombination substrates.
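The per-end averages quoted in the text can be recomputed from the C12 and C23 columns of a junction table, weighting each row by its number of isolates. The following Python sketch does this for the pG10/T10-Cd junctions of Fig. 3A. It is illustrative only: the names are hypothetical, the rows are transcribed from the figure, and counting an end as "undeleted" when its deletion value is 0 is an assumption about how such ends were scored.

```python
# Each tuple is (number of isolates, C12, C23), transcribed from Fig. 3A
# (pG10/T10-Cd). C12 and C23 are the nucleotides deleted from the
# coding-RSS(12) and coding-RSS(23) ends, written as negative numbers.
FIG_3A = [
    (2, 0, -2), (1, 0, -2), (4, 0, -4), (1, 0, -5), (1, 0, -10),
    (4, -1, -2), (6, -1, -3), (1, -2, 0), (1, -2, 0), (2, -2, -1),
    (1, -2, -2), (1, -2, -3), (1, -2, -7), (1, -2, -10), (5, -4, -2),
    (1, -4, -4), (1, -4, -5), (1, -5, -5), (1, -6, -1), (1, -7, -3),
    (1, -8, -4),
]


def deletion_summary(rows):
    """Isolate-weighted mean deletion and undeleted-end counts per coding end."""
    total = sum(n for n, _, _ in rows)
    return {
        "junctions": total,
        "mean_deletion_12": sum(n * abs(c12) for n, c12, _ in rows) / total,
        "mean_deletion_23": sum(n * abs(c23) for n, _, c23 in rows) / total,
        # Assumption: an end is "undeleted" when its deletion value is 0.
        "undeleted_12": sum(n for n, c12, _ in rows if c12 == 0),
        "undeleted_23": sum(n for n, _, c23 in rows if c23 == 0),
    }


if __name__ == "__main__":
    for key, value in deletion_summary(FIG_3A).items():
        print(key, round(value, 2))
```

With these rows the weighted means come to about 2.1 bp for the G10 end and 3.1 bp for the T10 end over 38 junctions, matching the averages reported for this substrate.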

of pC10/T10-Cd joints had deletions of 17 and 18 bp with junctions occurring in a distal site of short DNA overlap. These exceptional cases of coding end deletion might indicate the use of extended deletion in order to find a region of DNA overlap. However, analysis of 18 pC10/T5-Cd junctions indicated that none occurred in the equivalent region of DNA overlap even though this region was 5 bp closer (Fig. 3D). In the substrate pR1/T10-Cd, T10 is matched with a coding end that offered a one-base overlap at position 1. Only 1 of 14 joints was overlapped at T position 1. The average size of deletions was between 3 and 4 bp from each end in nonhomologous regions (Fig. 3C). Therefore, joining occurs efficiently in regions without DNA homology. However, the extent of deletion may be flexible, and the DNA sequences of coding ends may influence this process (see below).

Junctional deletion is conserved in templates containing DNA homology at coding ends. We designed two tests in which coding end DNA homology is purposely introduced into these V(D)J recombination substrates. We speculated that for a site of DNA homology to be important for joining, it may have to be exposed by the nucleolytic processing of the coding ends during the reaction. If it appeared as a single-stranded end with the proper positioning, then it may be used to stimulate more efficient joining by virtue of its ability to pair with the other coding end. Short (1- to 2-bp) DNA overlap was placed in otherwise nonhomologous coding end termini and at the positions for the average strand deletion (residues -3 to -4 on each coding sequence). Three V(D)J recombination substrates with internal DNA overlap yielded similar recombination frequencies (DNA overlap is underlined): pG3CAG2/T2CAT3-Cd (Rav = 33), pG3CAG4/T2CAT3-Cd (Rav = 39), and pG4AG2/T3AT3-Cd (Rav = 26) (Table 2). The average recombination level is very similar to that of the pR1/R2-Cd substrate (Rav = 30). For pG3CAG2/T2CAT3-Cd, DNA overlaps occurred in 7 of 19 cases (Fig. 4). This frequency is not higher than chance occurrence over the first six residues. Interestingly, none of the seven cases were located at the introduced dinucleotide homologies. Therefore, introduction of limited DNA homology does not increase the likelihood of utilizing homology to increase recombination efficiency or to recruit those sequences in the junctions formed.

TABLE 2. Recombination frequencies of V(D)J substrates with DNA homology at coding endsa

Substrate                  Expt   No. Ampr    No. Camr Ampr   R (%)    Rc (%)    Rav (% ± SD)
Mixed nucleotide ends
  pJH290                   1        147,000      16,000       10.9     100       NA
                           2        200,000      11,200        5.6
                           3        158,000       8,400        5.3
                           4      2,490,000     180,000        7.2
                           5      1,170,000      46,800        4.0
                           6      1,420,000      65,900        4.6
                           7      1,520,000      62,300        4.1
                           8      1,110,000      38,900        3.5
                           9      1,080,000      27,900        2.6
Internal homology
  pG3CAG2/T2CAT3-Cd        3         26,000         850        3.27     62       33 ± 20
                           4      2,330,000      49,200        2.11     29
                           8      2,570,000      16,100        0.63     18
                           9        930,000       5,200        0.56     22
  pG3CAG4/T2CAT3-Cd        3          6,000         110        1.83     35       39 ± 17
                           4      1,760,000      73,400        4.17     58
                           8      2,510,000      21,600        0.86     25
  pG4AG2/T3AT3-Cd          3        127,000         630        0.50      9       26 ± 18
                           4      2,240,000      71,200        3.18     44
                           9      1,470,000       9,200        0.63     24
Extended homology
  pG10/G10-Cd              2         93,000         270        0.29      5.2     8 ± 3
                           6        310,000       1,480        0.48     10.4
                           9      2,170,000       4,400        0.20      7.7
  pC10/C10-Cd              6        230,000       1,870        0.81     17.6     9 ± 7
                           7      1,830,000       3,500        0.19      4.6
                           9      3,280,000       4,850        0.15      5.8
P-nucleotide end homology
  pG10/C10-Cd              1        260,000       3,300        1.27     11.7     12 ± 2
                           4      1,460,000      12,100        0.83     11.5
                           9      1,320,000       4,940        0.37     14.2
  pC10/G10-Cd              4        650,000       3,100        0.48      6.7     5 ± 2
                           7      1,310,000       1,800        0.14      3.4
                           9      1,370,000       1,220        0.09      3.5

a For details, see footnotes to Table 1.

Using the homopolymer strategy, we constructed two additional substrates containing 10 bp of perfect DNA homology at the coding ends. Applying the same logic, a terminal redundancy of 10 residues that are all G or C may be expected to allow the creation of the most stable pairing interactions at the coding ends. In HDR37A cells with use of the substrates pG10/G10-Cd and pC10/C10-Cd, Rav = 8 and 9, respectively (Table 2). These values were higher than that for nonhomologous sequences with T10 coding ends (Rav = 0.6 to 2.8) but less than that for nonhomologous sequences with (TA)5 coding ends (Rav = 27) (Table 1). Similarly, G/C homopolymer substrates without coding end homology (pG10/C10-Cd and pC10/G10-Cd [Rav = 12 and 5, respectively]) had recombination levels in the same range. Thus, extensive homology at coding ends does not significantly enhance the recombination potential of coding ends.

We then determined whether a dramatically increased potential for DNA homology in pG10/G10-Cd and pC10/C10-Cd would lead to decreased coding end processing by DNA sequencing of recombination junctions. Junctional deletion on one or both strands occurred in 30 of 32 rearrangements. We could not discriminate deletions on one or the other coding end because they were composed of the same nucleotide (G or C); however, the total joint deletion was evaluated. The average deletions of G10/G10 and C10/C10 junctions were 8.0 bp (4.0 bp per end) and 5.0 bp (2.5 bp per end), respectively (data not shown), values that are nearly identical to the average deletion in recombination junctions from random sequences. These results show that extensive coding end homology does not disrupt the coding end processing function or make these templates more efficient substrates for V(D)J recombination.

P nucleotides are more frequent in coding junctions with G and C homopolymer coding ends. Nucleotide addition in V(D)J recombination arises by two paths, untemplated N regions and templated P nucleotides (20, 26). Although N nucleotides are added following the loss of nucleotides during coding junction formation, P nucleotides are detectable only without deletion and thus are ordinarily very infrequent. An important issue is whether P nucleotides are always generated in the reaction and whether they are the precursors to processing steps. We observed P nucleotides in pG10/T10-Cd and pC10/T5-Cd: 18% (7 of 38) for G coding ends and 15% (6 of 40) for C coding ends (Fig. 3). P nucleotides present at T coding ends in these substrates were not found at an increased level: 1% (1 of 78). The P nucleotides were always mono- or dinucleotides fitting the definition of the inverse complement of the undeleted coding end. Therefore, it is very unlikely that these nucleotides were N nucleotides added by TdT. For comparison, in 37 junctions with pR1/R2-Cd, we observed 4 junctions consistent with the retention of P nucleotides (10%) (Fig. 2), where three-fourths of the junctions with P nucleotides arose from the G+C-rich R2 coding end.

P nucleotides are hypothesized to form by the asymmetric cleavage of coding end hairpins, leaving 3' or 5' extended residues for processing (28, 33). In these cases, terminal DNA overlap present in the single-stranded P-nucleotide arms could effectively interact with the other coding ends. P

nucleotides from both coding ends could be used for formation of short DNA overlaps to initiate junction formation. These matched coding ends would be expected to be generated in the pG10/G10-Cd and pC10/C10-Cd substrates and may result in increased retention of P nucleotides. We did not observe any examples of double-P-nucleotide retention in 12 rearrangements of pG10/G10-Cd; however, 2 of 12 had P nucleotides from one coding end. With pC10/C10-Cd junctions, 1 of 20 was potentially formed by the use of a P-nucleotide overlap from each coding end. Junctions where both C ends were undeleted were observed in only 10% (2 of 20) of pC10/C10-Cd junctions, and for G ends, undeleted junctions were not found (0 of 12 pG10/G10-Cd joints). Although pairing events using double P nucleotides may be possible, these data are more consistent with a model whereby this type of association does not predominate in junction resolution.

Isolates  Coding-RSS(12)  P    Coding-RSS(23)  C12  P12  P23  C23
          GACGGGCAGG           TTCATTTGGAT
1/19      GACGGGCAGG           ----TTTGGAT       0    0    0   -4
1/19      GACGGGCAGG           ------TGGAT       0    0    0   -6
1/19      GACGGGCAGG           -------GGAT       0    0    0   -7
1/19      GACGGGCAGG           ---------AT       0    0    0   -9
1/19      GACGGGCAG-           -----TTGGAT      -1    0    0   -5
1/19      GACGGGCAG-           ---------AT      -1    0    0   -9
2/19      GACGGGCA--      A    TTCATTTGGAT      -2    0    1    0
1/19      GACGGGCA--           ---ATTTGGAT      -2    0    0   -3
1/19      GACGGGCA--           ----------T      -2    0    0  -10
2/19      GACGGGC---           -TCATTTGGAT      -3    0    0   -1
1/19      GACGGG----           --------GAT      -4    0    0   -8
1/19      GACGGG----           ---------AT      -4    0    0   -9
2/19      GA--------      A    TTCATTTGGAT      -8    0    1    0
3/19      GA--------           ----TTTGGAT      -8    0    0   -4

FIG. 4. Coding junctions formed with coding end termini containing restricted DNA homology. The coding junction plasmid pG3CAG2/T2CAT3-Cd was transfected into HDR37A cells, and V(D)J recombination coding junctions were evaluated. Underlined residues in the plasmid are a CA dinucleotide that is redundant for both coding ends. The extent of coding junction deletion and P nucleotides are as in Fig. 2.

We also constructed pG10/C10-Cd and pC10/G10-Cd substrates in which P nucleotides from one end would produce short DNA overlap potential internally in the other coding end. The recombination frequencies of the pG10/C10-Cd and pC10/G10-Cd templates (Rav = 12 and 5, respectively; Table 2) were in the same range as those of V(D)J substrates containing G/C-rich coding ends but an extended terminal homology (pG10/G10-Cd and pC10/C10-Cd) (Rav = 8 and 9; Table 2). Thus, no enhancement in the recombination frequency is found by promoting DNA overlap of a P-nucleotide-extended coding end with the other coding end. Interestingly, pG10/C10-Cd and pC10/G10-Cd coding junctions were distinctly different, having deletions of significantly reduced size at both the G and C coding ends in HDR37A cells (Fig. 6). Undeleted G and C ends constitute 46% (12 of 26) and 38% (10 of 26), respectively, of the pG10/C10-Cd junctions. The average deletions for G coding ends (0.9 bp) and C coding ends (1.4 bp) also were considerably reduced, indicating less processing. P nucleotides retained from one coding end before joining could not be unambiguously identified in the presence of deletion on the other coding end. However, the observed lack of deletion in many of the junctions is consistent with the presence of some P nucleotides, especially if there is an asymmetric processing of the two coding ends (see below).

Processing is reduced for G and C coding ends irrespective of the partner. Our data may suggest that coding ends are processed differentially depending on the nucleotide composition of the ends. We calculated the fraction of events for which no deletion was observed at a particular coding end in several of the substrates (Fig. 5). Interestingly, G and C homopolymers in our defined templates very frequently had undeleted coding ends. In contrast, T homopolymers were almost never observed without deletion. The two randomized sequences, R1 and R2, were informative in that the more G+C-rich coding end (R2) had less deletion than the more A+T-rich coding end (R1) (Fig. 5). As mentioned previously, G and C homopolymer ends had lower than average deletion sizes, and in some cases T homopolymers had higher than average deletion sizes (Fig. 3). When no terminal DNA homology is present (pC10/T10-Cd, pG10/T10-Cd, and pC10/T5-Cd), there is a preference for deletions at the T homopolymers rather than the G or C homopolymers in the pair, although the average deletion per junction is normal (Fig. 3). Because junctional deletions can be asym-

FIG. 5. Deletion of coding termini of specific nucleotide homopolymers during V(D)J recombination. Undeleted coding junctions were calculated for the recombination products scored from several V(D)J substrates containing defined coding end sequences. The percentage of undeleted coding ends is plotted relative to the coding end composition for each substrate. The first residue listed signifies the homopolymer or randomized polymer used in the substrate (in parentheses) for the calculation.

DISCUSSION

Parameters of coding junction formation in V(D)J recombination were investigated by using defined DNA sequences in the positions of coding ends flanking RSS. We show that there is no tendency toward the recruitment of DNA overlap between coding ends to efficiently form recombination products.
Instead, the regular features of coding junction diver- metric between the two coding ends, we also calculated the sity, such as junctional deletion, N regions, and P nucle- otides, are displayed as expected. These features were average deletion size from either the RSS(12) or RSS(23) shown for nonhomologous, limited homology, complete coding end to be 3.3 or 2.7 bp, respectively. Therefore, it is homology, and mixed nucleotide composition coding se- unlikely that the preference for coding end deletion is quences. Importantly, the extent of junctional deletion can directed by the RSS(12) or RSS(23) sequence. Instead, it be significantly reduced by coding end combinations in appears that coding end composition influences the extent of which exclusively G and C homopolymeric residues are junctional deletion. used. The DNA processing functions are not constricted by The examination of coding junctions formed from pG1J DNA homology so that a number of distinct enzymatic C1o-Cd, pC1OG1o-Cd, and pG1JG1o-Cd suggests that the two functions are chosen among to generate junctional diversity coding ends of the V(D)J recombination reaction are usually (Fig. 7). The resolution of coding junctions can occur with- processed asymmetrically (Fig. 6). Therefore, both the ex- out strict selectivity for the composition of ends to be joined. tent of deletion and the coordination of processing of both Junctional deletion occurs with or without terminal DNA coding ends can be modulated by coding end DNA compo- overlap. We have shown that DNA overlap is not needed and sition. does not increase the efficiency of coding junction formation N-region addition. N-region addition is strongly associated in V(D)J recombination using substrates with homopolymers with the enzyme TdT during lymphoid differentiation. In at their coding ends. 
Substrates that maximize overlap by addition to creating a broader recognition diversity, N-re- providing 10 bp of complete homology did not enhance the gion addition could have a stimulatory function in the recombination frequency of recovered products and did not efficiency of V(D)J recombination coding junction forma- decrease the average junctional deletion. These recombina- tion. For example, substrates with a poor ability to join tion events are recombinase dependent and do not occur by coding ends because of lack of DNA homology might be reciprocal recombination between the homopolymer direct expected to yield a higher level of products when N addition repeats (4). Similarly, we found that a short dinucleotide or could be used to create homology. 18-8 Abelson murine mononucleotide redundancy at the most frequent positions leukemia virus-transformed pre-B cells were used as trans- for junctional deletion did not influence the placement of the fection recipients for V(D)J recombination substrates to coding junctions formed (Fig. 4). In several templates for examine the influence of N addition on coding junction which no terminal DNA homology was possible (Fig. 3), formation without preexisting DNA homologies. coding junctions were efficiently produced in the terminal Coding joints of pG1JC1o-Cd and pC1JG1O-Cd rearrange- regions without any DNA homology. In addition, the ability ments in 18-8 cells were examined (Fig. 6). First, N regions to introduce N regions into these nonhomologous coding of one to three nucleotides were added in only 9 of 44 events ends during joining does not facilitate a greater recombina- scored, suggesting that N addition was not a prerequisite for tion frequency with nonhomologous coding end substrates facilitating joining in these templates without coding se- (Fig. 6). quence DNA homology. 
Second, recombination frequencies Homopolymer coding ends may be subject to unusual base for pG1JC1o-Cd and pC1JG1o-Cd substrates in the TdT+ stacking or strand dissociation in the process of coding = and junction resolution. These effects might not be observed pre-B-cell line (pG10/C1o-Cd [Rav 9] pC1JG1o-Cd [Ray when coding ends of mixed nucleotide composition are used. = 1.1]) were lower than those of pR1/R2-Cd and pJH290 These effects may contribute to reduced recombination controls, consistent with the results for HDR37A cells efficiencies observed for homopolymer substrates relative to (Table 2). These values do not support any enhancement of mixed nucleotide composition substrates, such as pJH290 or product formation by allowing N-region addition. Interest- pR1/R2-Cd. Thus, to look for the effects of DNA homology ingly, we found that N-region addition did not influence the on recombination efficiency, we could compare substrates of disparity in junctional deletions observed above for G/C the same or similar G/C content that would have the same coding ends. Fifty percent (11 of 22) of G coding ends and base stacking effects. The recombination frequencies of 77% of of C ends were undeleted in (17 22) coding pG1J pG1JG10-Cd (Rav = 8) and pC10C10-Cd (Rav = 9) compared C1o-Cd (Fig. 6). Also, 55% (12 of 22) of C coding ends and with those of pG1JC10-Cd (Rav = 12) and pC10G10-Cd (Ra, 32% (7 of 22) of G coding ends were undeleted in pC10/ = 5) indicate that extended homology does not increase the G1o-Cd rearrangements from 18-8 cells. Similarly, pC1J efficiency of the reaction. T5-Cd and pC1J/T1o-Cd substrates yielded somewhat re- Specific early Ig and TCR gene rearrangements in mice duced recombination frequencies compared with pR1/R2-Cd have been found to frequently utilize short homology in 18-8 cells despite the ability to introduce N nucleotides stretches in DJ junctions in one reading frame (6, 10, 11, 27). (data not shown). 
Therefore, these experiments support our Two recent studies showed that the utilization of DNA other findings with these substrates in HDR37A cells, show- overlap in certain developmentally early rearrangement ing diminished junctional deletions with G+C-rich and non- events was not due to cell selection (2, 18). Therefore, in homologous coding ends. N addition is unlikely to facilitate vivo, there appear to be conditions in which DNA overlap is more efficient joining. Instead, N addition probably func- selectively used during recombination in a programmed tions principally to program diversification of junctions to fashion. Particular V, D, and J coding sequences may be expand an immune repertoire in vivo. evolutionarily optimized in the terminal residues of the VOL. 13, 1993 V(D)J RECOMBINATION CODING JUNCTION FORMATION 6965 A. pGlo/ClO-Cd Coding-RSS(12) Coding-RSS(23) GGGGGGGGGG CCCCCCCCCC C12 P12 N P23 C23

18-8 cells (columns: C12, P12, N, P23, C23)

7/22  GGGGGGGGGG       CCCCCCCCCC          0    0    0    0    0
1/22  GGGGGGGGGG       -CCCCCCCCC          0    0    0    0   -1
3/22  GGGGGGGGGG  T    --CCCCCCCC          0    0    1    0   -2
5/22  GGGGGGGGG-       CCCCCCCCCC         -1    0    0    0    0
1/22  GGGGGGGG--  A    CCCCCCCCCC         -2    0    1    0    0
1/22  GGGGGGGG--       CCCCCCCCCC         -2    0    0    0    0
1/22  GGGGGGGG--       --CCCCCCCC         -2    0    0    0   -2
1/22  GGGGGGG---       CCCCCCCCCC         -3    0    0    0    0
1/22  GGGGGG----  TGG  CCCCCCCCCC         -4    0    1    2    0
1/22  GGGGGG----  T    CCCCCCCCCC         -4    0    1    0    0

HDR37A cells (columns: C12, P12, P23, C23)

4/26  GGGGGGGGGG       CCCCCCCCCC          0    0    0    0
5/26  GGGGGGGGGG       -CCCCCCCCC          0    0    0   -1
1/26  GGGGGGGGGG       --CCCCCCCC          0    0    0   -2
1/26  GGGGGGGGGG       ---CCCCCCC          0    0    0   -3
1/26  GGGGGGGGGG       ----CCCCCC          0    0    0   -4
3/26  GGGGGGGGG-       CCCCCCCCCC         -1    0    0    0
1/26  GGGGGGGGG-       --CCCCCCCC         -1    0    0   -2
2/26  GGGGGGGGG-       ---CCCCCCC         -1    0    0   -3
3/26  GGGGGGGG--       CCCCCCCCCC         -2    0    0    0
2/26  GGGGGGGG--       --CCCCCCCC         -2    0    0   -2
1/26  GGGGGGGG--       ----CCCCCC         -2    0    0   -4
1/26  GGGGGGG---       --CCCCCCCC         -3    0    0   -2
1/26  GGGGGGG---       ----CCCCCC         -3    0    0   -4
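The summary values quoted in the text for pG10/C10-Cd junctions in HDR37A cells (undeleted G and C ends, and average deletion per end) can be recomputed from the HDR37A rows of panel A. The following is a minimal sketch of that arithmetic, not the authors' analysis procedure; the tuple layout and the names HDR37A_G10_C10 and summarize_end are hypothetical and introduced only for illustration.

```python
# Junction classes for pG10/C10-Cd in HDR37A cells, transcribed from the Fig. 6A rows.
# Each tuple is (number of junctions, bases deleted from the G end, bases deleted from the C end).
# The tuple layout and all names below are illustrative assumptions, not part of the paper.
HDR37A_G10_C10 = [
    (4, 0, 0), (5, 0, 1), (1, 0, 2), (1, 0, 3), (1, 0, 4),
    (3, 1, 0), (1, 1, 2), (2, 1, 3),
    (3, 2, 0), (2, 2, 2), (1, 2, 4),
    (1, 3, 2), (1, 3, 4),
]


def summarize_end(rows, end_index):
    """Return (total junctions, undeleted junctions, mean deletion in bp) for one coding end.

    end_index is 1 for the RSS(12) (G) end and 2 for the RSS(23) (C) end.
    """
    total = sum(row[0] for row in rows)
    undeleted = sum(row[0] for row in rows if row[end_index] == 0)
    mean_deletion = sum(row[0] * row[end_index] for row in rows) / total
    return total, undeleted, mean_deletion


if __name__ == "__main__":
    for label, idx in (("G end", 1), ("C end", 2)):
        total, undeleted, mean_del = summarize_end(HDR37A_G10_C10, idx)
        print(f"{label}: {undeleted}/{total} undeleted "
              f"({100 * undeleted / total:.0f}%), mean deletion {mean_del:.1f} bp")
```

Run as a script, this gives 12 of 26 (46%) undeleted G ends with a mean deletion of 0.9 bp and 10 of 26 (38%) undeleted C ends with a mean deletion of 1.4 bp, matching the values stated in the text.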

B. pC10/G10-Cd

      Coding-RSS(12)   Coding-RSS(23)    C12  P12    N  P23  C23
      CCCCCCCCCC       GGGGGGGGGG

18-8 cells (columns: C12, P12, N, P23, C23)

3/22  CCCCCCCCCC       GGGGGGGGGG          0    0    0    0    0
1/22  CCCCCCCCCC  G    GGGGGGGGGG          0    1    0    0    0
6/22  CCCCCCCCCC       -GGGGGGGGG          0    0    0    0   -1
1/22  CCCCCCCCCC  GGA  -GGGGGGGGG          0    2    1    0   -1
1/22  CCCCCCCCCC       --GGGGGGGG          0    0    0    0   -2
1/22  CCCCCCCCC-       ---GGGGGGG         -1    0    0    0   -3
1/22  CCCCCCCC--       GGGGGGGGGG         -2    0    0    0    0
1/22  CCCCCCC---  TCC  -GGGGGGGGG         -3    0    3    0   -1
2/22  CCCCCCC---       -GGGGGGGGG         -3    0    0    0   -1
2/22  CCCCCCC---       --GGGGGGGG         -3    0    0    0   -2
2/22  CCCCCC----       GGGGGGGGGG         -4    0    0    0    0
1/22  CCCCCC----       ---GGGGGGG         -4    0    0    0   -3

HDR37A cells (columns: C12, P12, P23, C23)

1/12  CCCCCCCCCC       GGGGGGGGGG          0    0    0    0
5/12  CCCCCCCCCC       -GGGGGGGGG          0    0    0   -1
1/12  CCCCCCCCCC       --GGGGGGGG          0    0    0   -2
1/12  CCCCCCC---       -GGGGGGGGG         -3    0    0   -1
1/12  CCCCCCC---       ---GGGGGGG         -3    0    0   -3
1/12  CCCCCC----       --GGGGGGGG         -4    0    0   -2
1/12  CCCC------       ------GGGG         -6    0    0   -6
1/12  CCC-------       ---GGGGGGG         -7    0    0   -3
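The design of these substrates rests on a simple sequence relationship: P nucleotides are palindromic to the terminal bases of the coding end they extend, so a G end can acquire terminal C residues and a C end terminal G residues. The sketch below illustrates only that sequence logic, under the assumption that coding ends are written on the top strand as in the figure panels (RSS(12) end to the left of the junction, RSS(23) end to the right). The function names are hypothetical, and the code says nothing about how often such overlaps are used.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def p_nucleotides(coding_end, n, side):
    """Top-strand P nucleotides templated by the junction-proximal n bases of a coding end.

    side="left":  the end is written left of the junction (its terminus is its last base).
    side="right": the end is written right of the junction (its terminus is its first base).
    """
    if n == 0:
        return ""
    terminal = coding_end[-n:] if side == "left" else coding_end[:n]
    return reverse_complement(terminal)


def single_p_overlap(left_end, right_end, n):
    """True if n P nucleotides from either end match the undeleted terminus of the other end."""
    if n == 0:
        return False
    return (p_nucleotides(left_end, n, "left") == right_end[:n]
            or p_nucleotides(right_end, n, "right") == left_end[-n:])


def double_p_overlap(left_end, right_end, n):
    """True if n P nucleotides from each end read the same on the top strand and could pair."""
    if n == 0:
        return False
    return p_nucleotides(left_end, n, "left") == p_nucleotides(right_end, n, "right")


if __name__ == "__main__":
    g10, c10, t10 = "G" * 10, "C" * 10, "T" * 10
    print(single_p_overlap(g10, c10, 2))  # G10/C10: P from one end matches the other end
    print(double_p_overlap(g10, g10, 2))  # G10/G10: overlap needs P nucleotides on both ends
    print(single_p_overlap(g10, t10, 2))  # G10/T10: no P-nucleotide overlap
```

Under these assumptions, a single P nucleotide at the RSS(23) end TTCATTTGGAT is A, which agrees with the inserted A in the Fig. 4 junctions scored with P23 = 1.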

FIG. 6. Decreased junctional deletion in pG10/C10-Cd and pC10/G10-Cd V(D)J recombination substrates. Two coding junction plasmids were transfected into 18-8 or HDR37A cells, and coding junctions were evaluated. In pG10/C10-Cd (A) and pC10/G10-Cd (B), the two coding ends are nonhomologous. P nucleotides on one coding end can be homologous to the other coding end without P nucleotides. 18-8 cells are competent for N-nucleotide addition during V(D)J recombination. The extent of deletion and P-nucleotide retention are as in Fig. 2.

coding ends to enhance their utilization when paired together. In our studies, we could not eliminate the possibility that >2-bp DNA overlap may be significant in joining without contributing to the overall frequency of product formation. Four-nucleotide homology in coding junction substrates has recently been shown to generate a higher frequency of products using the homology (8). Ig DH-to-JH and TCR γδ junctions where homology was profoundly observed frequently consisted of >2 bp of overlap (11). In our data, extended homology (10 bp) found in the pG10/G10-Cd and pC10/C10-Cd substrates did not achieve an increased recombination potential relative to substrates without DNA homology (Table 1). Possibly such extended homology could be more significant for the reaction efficiency in cells developing in the immune system. However, this would argue that the in vivo state is more limited by the efficiency of coding junction synthesis than cleavage, which seems unlikely.

Nonhomologous end joining reactions in mammalian cells are phenotypically similar to V(D)J recombination. Several similarities between coding junction formation and nonhomologous DNA recombination events in eukaryotes have previously been noted, suggesting that these junctions may be processed by related enzyme functions (32). Nonhomologous end joining is associated with junctional deletion and addition of nucleotides, but end joining most frequently proceeds without any deletion of nucleotides (60%), even if incompatible restriction enzyme ends are being rejoined (34). A potential recruitment of short stretches of DNA homology has been described to explain end joining where substrate deletion ensues (34). Despite the common goal of rejoining DNA strands without strict requirements for homology, several recent experiments argue that these processes may be distinctly different. One feature of nonhomologous recombination has been the integration frequency of linearized plasmid DNA upon transfection. We and others (12, 39) have shown that scid fibroblast lines have a poorer capacity to integrate DNA nonhomologously. However, DNA end joining as measured by recircularization of plasmid DNA is normal for scid extracts in vitro (39) and during transient transfection (12). Thus, even though the scid gene product is clearly involved in V(D)J recombination and double-strand break repair, scid also appears to have a role in some, but not all, types of nonhomologous recombination. By comparison, cell lines from the human DNA repair and immunodeficiency disease Bloom syndrome have an associated ligase deficiency and high rates of sister chromatid exchange. Bloom syndrome cells have decreased nonhomologous recombination levels with increased junctional deletions (35). In contrast, we have shown that the fidelity and frequency of V(D)J rearrangement in Bloom syndrome and DNA ligase I mutant cells are unchanged from normal (31).

A possible resolution of these dissimilarities in end joining pathways may be that intermediate DNA end structures differ. In the cases of nonhomologous end joining reactions, restriction enzyme cleavages are used to form DNA termini with staggered or blunt free ends. For V(D)J recombination, it is possible that hairpin intermediates are cleaved to generate coding ends for processing. The structures of hairpins themselves or the resolved products of hairpins may present a unique set of structures that are not the same as restriction enzyme generated termini. Because V(D)J coding junction formation is a step that must occur after cleavage, constraints may also be imposed by the synaptic protein recombinase complex that could serve to channel the processing reactions into a particular pathway.

Evidently coding junction formation can proceed quite efficiently without coordinated junctional deletion (pG10/C10-Cd and pC10/G10-Cd; Fig. 6). The lack of deletion may indicate an ability of undeleted strands from one coding end to join directly to deleted strands in the other coding end. This outcome may be stimulated to occur to a greater degree when nucleolytic activity on G/C coding ends is limiting (Fig. 5) or by the presence of a protein complex holding the two coding ends together. In Xenopus oocytes, measurement of linearized plasmid DNA recircularization shows that proteins may hold ends together to allow gap filling across an unligated junction (41). Although single-stranded ligation is possible, the Xenopus data are more supportive of polymerase using a 3'-OH of one end as a primer and the other processed DNA end as a template. Either model is plausible with V(D)J recombination.

FIG. 7. Processing of coding ends during V(D)J recombination. Coding junction formation is depicted as a multistep mechanism that is distinct from the V(D)J recombination cleavage events. In this model, DNA hairpin intermediates are shown for each coding end. Hairpins are resolved to generate a template for DNA processing and joining. A model consistent with our data in which asymmetric processing occurs during the joining reaction is shown. [Diagram labels: Hairpins in synaptic complex; One Hairpin Processed; Second Hairpin Processed; Joining with Deletion; Joining with Reduced Deletion.]

P nucleotides and reduced processing of coding ends. We observed limitations to junctional deletions only with certain of the homopolymer substrates that we tested. G and C homopolymers matched with mixed nucleotide composition (R) or another homopolymer (T) show elevated frequencies of P nucleotides attributable to the G and C coding ends (Fig. 3). In addition, when pR1/R2-Cd is examined, the G+C-rich coding end is deleted less than the other random coding end (Fig. 2). Although we did not show the presence of P nucleotides in pG10/C10-Cd and pC10/G10-Cd junctions directly, a combined evaluation of all of our data argues that P nucleotides are formed but not efficiently processed in these substrates. Interestingly, a composition or DNA sequence bias has been correlated with P-nucleotide formation from previous studies with defined V(D)J substrates (28).

A limited nucleotide processing of these strands could be explained by two different means. First, DNA overlap might enhance the retention of P nucleotides, either by their ability to be directly ligated or by their ability to serve as a primer for DNA polymerization of short gaps across the junction.

Recently, it was suggested that the formation of P-nucleotide ends during coding junction processing would make available ends for DNA overlap homology to be used for joint formation (23). We do not favor this interpretation because we have demonstrated with the nonhomologous coding end joining substrates that there is no DNA homology preference. Therefore, it is difficult to argue that it is important in one context but not another. A second explanation of our data may be that there is a reduced digestion of coding ends with specific nucleotide compositions, particularly G and C homopolymers, and possibly any G+C-rich coding ends. Possibly a nuclease activity would be selectively constrained by nucleotide composition. Similarly, coding end composition may be critical for hairpin resolution; A+T-rich hairpins may be expected to be considerably more flexible and open than G+C-rich hairpins. The elevated frequency of P nucleotides at G10/T10 (21%) and C10/T5 junctions (33%) indicates that the formation of P nucleotides is normal for V(D)J recombination and probably that hairpin structures are the most likely reaction intermediates. On the other hand, particular coding end compositions could differentially influence the stability or structure of coding end DNA-protein interactions and thereby influence the processing of the ends by a constitutive nuclease activity.

V(D)J junction formation may use protein-protein complexes rather than DNA homology. Our data are consistent with the resolution of the two coding ends as an asynchronous process (Fig. 7). In this model, the coding end that is resolved first will be processed more extensively than the second one. Thus, if A+T-rich coding ends are an easier substrate for hairpin resolution, then more extensive processing would occur at these ends than at G+C-rich ones during V(D)J recombination. An asynchronous processing of the two coding ends may also explain our observations with substrates that can join by using P nucleotides. Many of the G10/C10 and C10/G10 junctions have undeleted coding ends so that joining could occur with the retained P nucleotides on one strand in the presence of processing of the other strand. If G+C-rich sequences inhibit processing, then the coding ends will be more likely to join by using terminal DNA homology in the P nucleotides.

Our experiments do not provide any direct evidence for protein-protein interactions in V(D)J recombination coding junction resolution. However, in the absence of any necessity for DNA pairing at short homology regions, the persistence of a synaptic protein complex to facilitate coding junction formation could stimulate joining by anchoring DNA ends together (Fig. 7). Two additional mammalian cell mutants (XR1 and xrs6) in which mutant effects on both V(D)J recombination signal and coding junctions are observed have recently been described (30, 40). In these studies, decreased recombination frequencies and aberrant sizes of coding and signal junction deletions are observed, making it unlikely that the errors are introduced as a result of faulty initiation steps or that the normal roles of the proteins are in processing functions. An alternative function for these gene products would be the involvement in protein-protein interactions that stimulate product formation. Although we cannot argue that all of the nucleolytic and DNA polymerization functions used in the recombination reaction are functioning during the presence of a synaptic complex, it stands to reason that the central feature of end deletion for coding joints may be built into the synaptic complex function. Under these circumstances, junction formation in a protein complex is unrestricted by a necessity for stabilization by DNA pairing. Further experiments will be needed to illuminate the importance of protein complexes in holding DNA together for V(D)J recombination.

ACKNOWLEDGMENTS

We thank Gene Oltz, Gary Rathbun, and Fred Alt for contribution of the HDR37A cell line prior to publication. We thank John Petrini, Gene Oltz, and Fred Alt for critical reading of the manuscript and helpful suggestions. We thank members of the Weaver laboratory for stimulating discussions and input into these experiments.

This work was supported by NIH GM39312 and an ACS Junior Faculty Research Award to D.T.W.

REFERENCES

1. Akira, S., K. Okazaki, and H. Sakano. 1987. Two pairs of recombination signals are sufficient to cause immunoglobulin V-(D)-J joining. Science 238:1134-1138.
2. Asarnow, D. M., D. Cado, and D. H. Raulet. 1993. Selection is not required to produce invariant T-cell receptor γ-gene junctional sequences. Nature (London) 362:158-160.
3. Biedermann, K. A., J. Sun, A. J. Giaccia, L. M. Tosto, and J. M. Brown. 1991. scid mutation in mice confers hypersensitivity to ionizing radiation and a deficiency in DNA double-strand break repair. Proc. Natl. Acad. Sci. USA 88:1394-1397.
4. Boubnov, N. V., and D. T. Weaver. Unpublished data.
5. Boubnov, N. V., Z. P. Wills, and D. T. Weaver. Unpublished data.
6. Feeney, A. J. 1992. Predominance of VH-D-JH junctions occurring at sites of short sequence homology results in limited junctional diversity in neonatal antibodies. J. Immunol. 149:222-229.
7. Fulop, G. M., and R. A. Phillips. 1990. The scid mutation in mice causes a general defect in DNA repair. Nature (London) 347:479-482.
8. Gerstein, R. M., and M. R. Lieber. 1993. Extent to which homology can constrain coding exon junctional diversity in V(D)J recombination. Nature (London) 363:625-627.
9. Grant, S. G. N., J. Jessee, F. R. Bloom, and D. Hanahan. 1990. Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation-restriction mutants. Proc. Natl. Acad. Sci. USA 87:4645-4649.
10. Gu, H., I. Foster, and K. Rajewsky. 1990. Sequence homologies, N sequence insertion and JH gene utilization in VHDJH joining: implications for the joining mechanism and the ontogenetic timing of Ly1 B cell and B-CLL progenitor generation. EMBO J. 9:2133-2140.
11. Gu, H., D. Kitamura, and K. Rajewsky. 1991. B cell development regulated by gene rearrangement: arrest of maturation by membrane-bound Dμ protein and selection of DH element reading frames. Cell 65:47-54.
12. Harrington, J., C.-L. Hsieh, J. Gerton, G. Bosma, and M. R. Lieber. 1992. Analysis of the defect in DNA end joining in the murine scid mutation. Mol. Cell. Biol. 12:4758-4768.
13. Hendrickson, E. A., X.-Q. Qin, E. A. Bump, D. G. Schatz, M. Oettinger, and D. T. Weaver. 1991. A link between double-strand break-related repair and V(D)J recombination: the scid mutation. Proc. Natl. Acad. Sci. USA 88:4061-4065.
14. Hendrickson, E. A., D. G. Schatz, and D. T. Weaver. 1988. The scid gene encodes a trans-acting factor that mediates the rejoining event of Ig gene rearrangement. Genes Dev. 2:817-829.
15. Hendrickson, E. A., M. S. Schlissel, and D. T. Weaver. 1990. Wild-type V(D)J recombination in scid pre-B cells. Mol. Cell. Biol. 10:5397-5407.
16. Hesse, J. E., M. R. Lieber, M. Gellert, and K. Mizuuchi. 1987. Extrachromosomal DNA substrates in pre-B cells undergo inversion or deletion at immunoglobulin V-(D)-J joining signals. Cell 49:775-783.
17. Hesse, J. E., M. R. Lieber, K. Mizuuchi, and M. Gellert. 1989. V(D)J recombination: a functional definition of the joining signals. Genes Dev. 3:1053-1061.
18. Itohara, S., P. Mombaerts, J. Lafaille, J. Iacomini, A. Nelson, A. R. Clarke, M. L. Hooper, A. Farr, and S. Tonegawa. 1993. T cell receptor δ gene mutant mice: independent generation of αβ T cells and programmed rearrangements of γδ TCR genes. Cell 72:337-348.
19. Komori, T., A. Okada, V. Stewart, and F. Alt. 1993. Lack of N regions in antigen receptor variable region genes of TdT-deficient lymphocytes. Science 261:1171-1175.
20. Lafaille, J. J., A. DeCloux, M. Bonneville, Y. Takagaki, and S. Tonegawa. 1989. Junctional sequences of T cell receptor γδ genes: implications for γδ T cell lineages and for a novel intermediate of V-(D)-J joining. Cell 59:859-870.
21. Landau, N. R., D. G. Schatz, M. Rosa, and D. Baltimore. 1987. Increased frequency of N-region insertion in a murine pre-B cell line infected with a terminal deoxynucleotidyl transferase retroviral expression vector. Mol. Cell. Biol. 7:3237-3243.
22. Lewis, S. M., J. E. Hesse, K. Mizuuchi, and M. Gellert. 1988. Novel strand exchanges in V(D)J recombination. Cell 55:1099-1107.
23. Lieber, M. R. 1991. Site-specific recombination in the immune system. FASEB J. 5:2934-2944.
24. Lieber, M. R., J. E. Hesse, S. Lewis, G. C. Bosma, N. Rosenberg, K. Mizuuchi, M. J. Bosma, and M. Gellert. 1988. The defect in murine severe combined immune deficiency: joining of signal sequences but not coding segments in V(D)J recombination. Cell 55:7-16.
25. Malynn, B. A., T. K. Blackwell, G. M. Fulop, G. A. Rathbun, A. J. W. Furley, P. Ferrier, L. B. Heinke, R. A. Phillips, G. D. Yancopoulos, and F. W. Alt. 1988. The scid defect affects the final step of the immunoglobulin VDJ recombinase mechanism. Cell 54:453-460.
26. McCormack, W. T., L. W. Tjoelker, L. M. Carlson, B. Petryniak, C. F. Barth, E. H. Humphries, and C. B. Thompson. 1989. Chicken IgL gene rearrangement involves deletion of a circular episome and addition of single nonrandom nucleotides to both coding segments. Cell 56:785-791.
27. Meek, K. 1990. Analysis of junctional diversity during B lymphocyte development. Science 250:820-823.
28. Meier, J. T., and S. M. Lewis. 1993. P nucleotides in V(D)J recombination: a fine-structure analysis. Mol. Cell. Biol. 13:1078-1092.
29. Oltz, E. M., F. W. Alt, W.-C. Lin, J. Chen, G. Taccioli, S. Desiderio, and G. Rathbun. 1993. A V(D)J recombinase-inducible B-cell line: role of transcriptional enhancer elements in directing V(D)J recombination. Mol. Cell. Biol. 13:6223-6230.
30. Pergola, F., M. Z. Zdzienicka, and M. R. Lieber. 1993. V(D)J recombination in mammalian cell mutants defective in DNA double-strand break repair. Mol. Cell. Biol. 13:3464-3471.
31. Petrini, J. H. J., J. W. Donovan, C. DiMare, and D. T. Weaver. Normal V(D)J coding junction formation in DNA ligase I deficiency syndromes. Submitted for publication.
32. Roth, D. B., X.-B. Chang, and J. H. Wilson. 1989. Comparison of filler DNA at immune, nonimmune, and oncogenic rearrangements suggests multiple mechanisms of formation. Mol. Cell. Biol. 9:3049-3057.
33. Roth, D. B., J. P. Menetski, P. B. Nakajima, M. J. Bosma, and M. Gellert. 1992. V(D)J recombination: broken DNA molecules with covalently sealed (hairpin) coding ends in scid mouse thymocytes. Cell 70:983-991.
34. Roth, D. B., and J. H. Wilson. 1986. Nonhomologous recombination in mammalian cells: role for short sequence homologies in the joining reaction. Mol. Cell. Biol. 6:4295-4304.
35. Runger, T. M., and K. H. Kraemer. 1989. Joining of linear plasmid DNA is reduced and error-prone in Bloom's syndrome cells. EMBO J. 8:1419-1425.
36. Schuler, W., N. R. Ruetsch, M. Amsler, and M. J. Bosma. 1991. Coding joint formation of endogenous T cell receptor genes in lymphoid cells from scid mice: unusual P-nucleotide additions in VJ-coding joints. Eur. J. Immunol. 21:589-596.
37. Schuler, W., I. J. Weiler, A. Schuler, R. A. Phillips, N. Rosenberg, T. W. Mak, J. F. Kearney, R. P. Perry, and M. J. Bosma. 1986. Rearrangement of antigen receptor genes is defective in mice with severe combined immune deficiency. Cell 46:963-972.
38. Sheehan, K. M., and M. R. Lieber. 1993. V(D)J recombination: signal and coding joint resolution are uncoupled and depend on parallel synapsis of the sites. Mol. Cell. Biol. 13:1363-1370.
39. Staunton, J., and D. T. Weaver. Unpublished data.
40. Taccioli, G. E., G. Rathbun, E. Oltz, T. Stamato, P. A. Jeggo, and F. W. Alt. 1993. Impairment of V(D)J recombination in double-strand break repair mutants. Science 260:207-210.
41. Thode, S., A. Schafer, P. Pfeiffer, and W. Vielmetter. 1990. A novel pathway of DNA end-to-end joining. Cell 60:921-928.
42. Tonegawa, S. 1983. Somatic generation of antibody diversity. Nature (London) 302:575-581.