A Rous Sarcoma Virus Provirus Is Flankedby Short Direct Repeats of A
Total Page:16
File Type:pdf, Size:1020Kb
Proc. Natd Acad. Sci. USA Vol. 78, No. 7, pp. 4299-4303, July 1981 Biochemistry A Rous sarcoma virus provirus is flanked by short direct repeats of a cellular DNA sequence present in only one copy prior to integration (retroviruses/transposable elements/long terminal repeats/DNA sequences/integration sites) STEPHEN H. HUGHES*, ANN MUTSCHLER*, J. MICHAEL BISHOPt, AND HAROLD E. VARMUSt *Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724; and tDepartment of Microbiology and Immunology, University of California, San Francisco, California 94143 Contributed byJ. Michael Bishop, April 16, 1981 ABSTRACT The Rous sarcoma virus (RSV)-transformed rat forms ofcircular molecules. The smaller circle has a single LTR cell line RSV-NRK-2 contains a single complete RSV provirus. We and presumably is formed by homologous recombination be- have obtained recombinant A clones that contain both ends of the tween the LTRs. The larger circle has two copies of the LTR RSV provirus and the flanking rat sequences. The provirus is in- and presumably is formed by a ligation event that brings to- tegrated in unique DNA and is present in only one of the two ho- gether the ends ofthe linear viral DNA (3, 4). The nature ofthis mologous chromosomes. The rat sequences into which the RSV event is unknown, partly because the exact termini ofthe linear provirus integrated were also cloned from the RSV-NRK-2 cell DNA have not been defined. It is not known whether the linear line. The sequences of the regions involved in the recombination form or one ofthe circular forms ofDNA is the direct antecedent event have been determined and compared. Our data suggest of the integrated provirus. that, compared with the sequence of viral DNA in the large cir- is similar to that of cular form ofunintegrated viral DNA, the provirus lacks two base The structure of the integrated provirus pairs at each end and that the provirus is flanked by a six-base-pair unintegrated linear RSV DNA, with cellular DNAjoined to viral direct repeat of cellular DNA. This six-base-pair repeat was ap- DNA at or near the ends of the LTR (1, 2). The structures of parently created during the integration event because this se- viral RNA, the three forms ofunintegrated viral DNA, and the quence was present only once at the integration site before the provirus are given in Fig. 1. provirus was inserted. A survey of eight other independent RSV To learn more about the integration ofRSV DNA, we cloned transformed rat cell lines demonstrates that, in agreement with segments ofDNA that included thejunctions between viral and earlier results, the RSV proviruses have entered different seg- host DNA and compared these to the unaltered host sequences ments of rat cell DNA. We have also determined the sequence of into which the viral DNA integrated. These clones were derived a second virus DNA-host cell DNA junction from a second RSV- from the DNA ofthe transformed rat cell line RSV NRK-2 which transformed rat cell line (RSV-NRK4) and find that there are no contains a single complete RSV provirus (1). obvious similarities between the two integration sites or between The ends of the integrated provirus are flanked by a 6-bp the integration sites and the termini of viral DNA. repeat ofcellular DNA. This repeat is apparently created by the integration event because this 6-bp sequence is present only Retroviruses insert a DNA copy oftheir genome into host DNA once in the host DNA into which the provirus integrated. The as an obligate step in their life cycle. This recombination event provirus appears to have lost 2 bp from the right and left LTRs, is known to be relatively specific with respect to the site in viral based on comparisons with the sequence ofcloned circular viral DNA and relatively nonspecific with respect to the site in the DNA (8). However, there are bases at the junction between host DNA (1, 2). The retrovirus genome is RNA; before inte- cellular and viral DNA whose origin we cannot determine un- gration, the viral RNA-dependent DNA polymerase must copy ambiguously because the same base pairs exist at the corre- the RNA genome into DNA. The first stable intermediate is a sponding sites in viral and rat DNA. It therefore is possible that double-stranded linear DNA molecule made in the cytoplasm our estimate of the number of base pairs contributed to the 6- of infected cells. This viral DNA, which has a full-length (-) bp repeat by cellular DNA is wrong by 1. Ifthis were true, there strand and segmented (+) strands, is slightly larger than the would be a corresponding change of 1 bp in the number ofbase RNA from which it is derived, and sequences from both the 3' pairs lost from one LTR. We also determined the sequence of and 5' ends ofthe genomic RNA form a direct repeat (called the ajunction between virus and host cell DNA from a second RSV- LTR) at the termini ofthe linear DNA which varies in size from transformed rat cell line (RSV-NRK-4) and found that there are one species ofretrovirus to another. Each repeat has the struc- no obvious similarities between the two integration sites. ture U3RU5; U3 is a unique sequence from the 3' end of viral Integration of RSV DNA is similar to the integration of the RNA and U5 is a unique sequence from the 5' end ofviral RNA DNAs of other retroviruses, murine sarcoma virus, spleen ne- (1, 3, 4) (see Fig. 1). The R sequences are repeated at both the crosis virus, and mouse mammary tumor virus, which lose 2 bp 5' and 3' ends of viral RNA. In the case of Rous sarcoma virus at each end during integration. These three proviruses are (RSV), U3 is 230 base pairs (bp), R is 21 bp, and U5 is 80 bp (5- flanked by repeats ofcellular DNA of4, 5, or 6 bp, respectively 8). (9-11). The product of integrative recombination of retrovirus The linear viral DNA migrates from the cytoplasm of an in- DNA resembles the product of integrative recombination of fected cell to the nucleus where it is converted to two major bacterial or yeast transposons (12-16), the bacterial virus Mu, (17-19), and the Drosophila movable elements such as copia The publication costs ofthis article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertise- Abbreviations: LTR, long terminal repeat; RSV, Rous sarcoma virus; ment" in accordance with 18 U. S. C. §1734 solely to indicate this fact. VP, base pair(s); kb, kilobase(s). 4299 Downloaded by guest on October 1, 2021 4300 Biochemistry: Hughes et al. Proc. Natl. Acad. Sci. USA 78 (1981) R U5 U3 R of the desired fragments. The individual A recombinants were /ld I lAn digested with EcoRI and the insert was identified by hybrid- ization. In all cases the cloned EcoRI fragment was exactly the U3 R U5 1I U3 R U5 same size as the corresponding fragment in the digest of ge- I rI t--I= = nomic DNA and contained the predicted restriction sites. The EcoRI fragments were subcloned into pBR322. The U3 R U5 i \ U3 R U5 U3 R U5 EcoRI-digested pBR322 was treated with an excess of calf in- testinal alkaline phosphatase (Boehringer) to reduce the fre- quency ofself-ligation. All pBR322 derivatives were cloned and propagated in the recA- host HB1O1. All procedures involving LTR LTR recombinant DNA were performed in accordance with the Na- tional Institutes of U3 R U5 U3 R U5 Health guidelines. Sequence Determination. DNA fragments were isolated on agarose or acrylamide gels and purified by chromatography on I I TIt DEAE-Sephacryl. The purified fragments either were labeled at their 3' ends with the Klenow fragment ofE coli polymerase FIG. 1. Intermediates in replication of RSV genome. The RNA I (Boehringer) or were treated with calfintestinal alkaline phos- genome of RSV (Top) is approximately 9.5 kilobases long. The 3' end phatase (Boehringer) and labeled at their 5' ends with of the RNA is polyadenylylated (An), and a 21-base sequence (R) ad- polynu- jacent to the poly (A) is also present at the extreme 5' end of the RNA. cleotide kinase (PL Biochemicals). The complementary strands A unique stretch of 230 bases adjacent to Ratthe 3' endofthe molecule ofthe labeled fragments were separated or the fragments were is called U3; a unique stretch of 80 bases adjacent to R at the 5' end cut again with a second restriction endonuclease. The asym- of the genome is called U5. When the RNA genome is copied into DNA, metrically labeled fragments were chromatographed on DEAE- the U3 and U5 sequences are found at both ends of the linear DNA, and Sephacryl after electrophoretic purification. The DNA se- the sequence U3RU5 is the LTR. The linear molecule is synthesized in quence was determined the chemical methods Maxam the cytoplasm and migrates to the nucleus when it is the precursor to by of and two circular forms, one with one copy and the other with two copies of Gilbert (22). the LTR. One of these three molecules (the linear form or one of the Southern Filter Hybridization. The conditions for complete two circular forms) is probably the direct precursor of the provirus digestion ofcellular DNA, electrophoresis ofthe resulting DNA (Bottom), in which the ends of the LTRs are now attached to cellular fragments, transfer to nitrocellulose, and hybridization with 32P DNA. The arrows under the provirus indicate the EcoRI sites in the labeled DNA probes have been described (1, 3).