Site-specific integration of adeno-associated virus involves partial duplication of the target locus

Els Henckaertsa, Nathalie Dutheila, Nadja Zeltnerb, Steven Kattmanc, Erik Kohlbrennerb, Peter Wardd, Nathalie Clémentb, Patricia Rebollob, Marion Kennedyc, Gordon M. Kellerc, and R. Michael Lindena,b,1

aDepartment of Infectious Diseases, King’s College London School of Medicine, London SE1 9RT, United Kingdom; Departments of bGene and Cell Medicine and dMedicine, Mount Sinai School of Medicine, One Gustave L. Levy Place, New York, NY 10029; and cMcEwen Centre for Regenerative Medicine, University Health Network, Toronto, ON, Canada M5G 1L7

MICROBIOLOGY

Edited by Kenneth I. Berns, University of Florida College of Medicine, Gainesville, FL, and approved March 9, 2009 (received for review July 21, 2008)

Author contributions: E.H., N.D., and R.M.L. designed research; E.H., N.D., N.Z., S.K., E.K., and P.W. performed research; N.C., P.R., M.K., and G.M.K. contributed new reagents/analytic tools; E.H., N.D., N.Z., S.K., E.K., P.W., M.K., G.M.K., and R.M.L. analyzed data; and E.H. and R.M.L. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
1To whom correspondence should be addressed at: Department of Infectious Diseases, King’s College London School of Medicine, London SE1 9RT, United Kingdom. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0806821106/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.0806821106 PNAS Early Edition

A variety of viruses establish latency by integrating their genome into the host genome. The integration event generally occurs in a nonspecific manner, precluding the prediction of functional consequences from resulting disruptions of affected host genes. The nonpathogenic adeno-associated virus (AAV) is unique in its ability to stably integrate in a site-specific manner into the human MBS85 gene. To gain a better understanding of the integration mechanism and the consequences of MBS85 disruption, we analyzed the molecular structure of AAV integrants in various latently infected human cell lines. Our study led to the observation that AAV integration causes an extensive but partial duplication of the target gene. Intriguingly, the molecular organization of the integrant leaves the possibility that a functional copy of the disrupted target gene could potentially be preserved despite the resulting rearrangements. A latently infected, Mbs85-targeted mouse ES cell line was generated to study the functional consequences of the observed duplication-based integration mechanism. AAV-modified ES cell lines continued to self-renew, maintained their multilineage differentiation potential and contributed successfully to mouse development when injected into blastocysts. Thus, our study reveals a viral strategy for targeted genome addition with the apparent absence of functional consequences.

embryonic stem cells | MBS85 | gene targeting | Rep | AAVS1

Wild-type adeno-associated virus has adopted a lifestyle that is unique among eukaryotic viruses. This nonautonomous parvovirus has evolved to efficiently replicate in cells that have been infected with helper viruses (e.g., adeno- or herpesviruses) (1). In the absence of helper virus infection, AAV can establish latency through site-specific genome integration into human chromosome 19 at 19q13.42 (2, 3).

It is well established that AAV-mediated site-specific integration requires the AAV Rep78/68 proteins in trans (4, 5), a cis-acting viral DNA sequence, which consists of a tetranucleotide repeat called the Rep-binding site (RBS) (6) and a 33-nt cellular sequence present at the integration site, termed AAVS1 (7). This sequence consists of a RBS and terminal resolution site (TRS), 2 motifs that within the ITRs of AAV (6, 8) together serve as the replication origin (9). The large Rep proteins can simultaneously bind the cellular and viral RBS, suggesting a mechanism of site-specific integration that is based on Rep-mediated tethering of the AAV genome to the AAVS1 sequence (10). The next step in the integration process involves Rep-mediated site-specific nicking of the AAVS1 TRS, generating a free DNA 3′-OH and a covalent 5′ DNA–Rep complex, similar to the initiation of AAV DNA replication (11). The subsequent steps remain to be elucidated, although the requirement of a functional AAV replication origin within AAVS1 is indicative of the involvement of AAVS1 replication (7). This Rep-induced replication is thought to be at the basis of the previously hypothesized amplification of the integration locus (12, 13).

Interestingly, a gene called protein phosphatase 1 regulatory inhibitor subunit 12C or MBS85 (myosin-binding subunit 85) was identified within AAVS1 (14). The MBS85 protein is thought to be involved in the regulation of actin–myosin fiber assembly, and its translation initiation start codon is located only 17 nt downstream of the RBS (14, 15). The AAVS1 locus is also closely linked to the muscle-specific genes TNNT1, encoding slow skeletal muscle troponin T, and TNNI3, encoding cardiac troponin I (16).

The fact that AAVS1 is located in a highly gene-dense region and that virtually all viral–cellular junctions are found within MBS85 highlights the potential complexity of the integration mechanism and raises the question about the possible consequences of AAV latency (i.e., MBS85 disruption). With the help of an extensive library of previously identified viral–cellular junctions, it has become clear that most of the integration sites characterized to date lie within the first exon and intron of MBS85, possibly leaving 1 allele undisrupted (summarized in ref. 17). Besides these observations, many aspects of this unique viral strategy remain elusive.

In this study, we investigated the AAV insertion profile in various latently infected human cell lines and observed that AAV integrates via a mechanism that includes the partial duplication of the target locus, potentially preserving a functional copy of the disrupted target gene. We took advantage of our previous observation that the AAVS1 locus is conserved in the mouse (15) and generated a latently infected mouse ES cell line to study the functional consequences of the observed duplication-based site-specific integration event. AAV-modified ES cell lines continued to self-renew, maintained their multilineage differentiation potential, and contributed successfully to mouse development when injected into blastocysts. Based on our findings, we propose a mechanism that could explain how this nonpathogenic virus can integrate into one of the most densely populated regions within the human genome in the absence of apparent deleterious effects.

Results

Identification of the Viral–Cellular Junctions in Various AAV-Infected Human Cell Lines. Latently infected HeLa cell lines were generated by using standard procedures. The junctions between the left ITR of AAV and cellular DNA were identified by linker-mediated (LM) PCR technology (18). Because the endonuclease activity of Rep initiates AAV-mediated targeted integration (19), we have arbitrarily designated the second T residue of the MBS85 TRS motif (GGTTGG) as nucleotide number 1. The left junction in the first cell line, HeLa-T1, is located in MBS85 at a significant distance from the cellular TRS (12.9 kb) and the viral ITR is missing the first 86 nt [Fig. 1 and supporting information (SI) Fig. S1]. In HeLa-T2 cells, the left junction is located in MBS85 at 14 kb from the TRS, and the first 109 nt of the viral ITR have been deleted (Fig. 1 and Fig. S1). The junctions between cellular DNA and right AAV ITR, identified by direct PCR, are located in MBS85 at 13 and 4 nt downstream from the TRS motif in, respectively, HeLa-T1 and -T2 cells (Fig. 1). The right ITRs are also partially deleted (Fig. S1). In HeLa-T1 cells, we isolated a second left and right junction, at, respectively, 12 and 0.3 kb downstream from the TRS (Fig. 1 and Fig. S1). Based on the observed positions of the left viral–cellular junctions, we designed a direct PCR technique using different forward primers located at 2-kb intervals in MBS85 combined with a fixed reverse primer in AAV. Using this method, we identified the left junction in a third cell line, HeLa-T3, at 9.3 kb downstream from the TRS (Fig. 1). The right junction in this cell line is located in MBS85 at 24 nt downstream from the TRS (Fig. 1). Both left and right ITRs of the provirus are partially deleted (Fig. S1). Analysis of the viral–cellular junctions in 2 previously established cell lines (4, 16) shows a similar integration pattern with the left junctions located far downstream from the cellular TRS motif, at 10.1 and 10.8 kb, whereas the corresponding right junctions were found in close proximity to this motif (Fig. 1 and Fig. S1).

Fig. 1. Molecular structure of site-specifically integrated wtAAV2 isolated from different latently infected human cell lines. Black boxes represent the first 3 exons of the MBS85 gene. The horizontal dashed line indicates that the sizes are not proportional. Vertical dashed lines indicate the junctions between MBS85 and wtAAV2. White boxes represent the left (AAV-L) and right (AAV-R) side of integrated wtAAV2. Numbers indicate the nucleotide positions of the junctions relative to the AAVS1 TRS motif.

wtAAV2 Integrates Site-Specifically Through Partial Duplication of MBS85 Sequences. The molecular characterization of 6 different integrants present in AAVS1 revealed that, as reported for many viral–cellular junctions, microhomologies or insertions of unknown sequences can be observed at the breakpoints. The previously unobserved positions of the cellular breakpoints were intriguing, i.e., >9 kb downstream from the cellular TRS motif (left junction) and within the 5′ UTR or exon1 of MBS85 (right junction). We were also struck by the 5′–3′ directionality of the cellular DNA present at both junctions. These observations led us to hypothesize that partial duplication of MBS85 is an integral part of the integration mechanism (Fig. 2A).

Fig. 2. Site-specific integration of wtAAV2 in HeLa cells occurs through partial duplication of the target site. (A) The duplication hypothesis. Sizes of the restriction fragments that hybridize to the MBS85-specific (pRVK, forward hatched box) and/or wtAAV2 (backward hatched box) probes are shown for the disrupted (Top) and undisrupted allele (Bottom). Black and gray boxes respectively represent MBS85 exons and integrated wtAAV2. The dashed line indicates that the sizes of the introns are not proportional. (Center) A schematic representation of the PCRs showing duplication of MBS85 sequences. (B) Southern blot analysis of EcoRI-digested genomic DNA from HeLa and HeLa-T3 cells using the MBS85 or wtAAV2 probe. Cohybridization is indicated by the asterisks. Sizes are in kilobases. (C) Images of the long PCR products encompassing the left (Upper) and right (Lower) junction, used to show duplication of MBS85 sequences.

To test this hypothesis, we confirmed the junctions observed in the HeLa-T3 cell line with 2 different primer sets and performed Southern blot analysis using 3 different restriction enzymes (for restriction diagram, see Fig. S2). As predicted from the proposed duplication model (Fig. 2A), hybridization of EcoRI-digested DNA with the MBS85 probe generated an 8.2-kb band corresponding to the undisrupted allele and an additional band at 10.4 kb, representing the disrupted allele (Fig. 2B Left). As expected, the 10.4-kb band, which contains the right junction, cohybridized with the wtAAV2 probe (Fig. 2B Right). Similar results were obtained with Southern blot analysis of BamHI- and HindIII-digested DNA (Fig. S3). In all 3 digests, hybridization with the wtAAV2 probe did not reveal the left junction-containing band as it binds to a smaller portion of the probe. The strongly hybridizing bands at 4.7 kb represent AAV head-to-tail concatemers. The 11-, 5-, and 10-kb bands visible after hybridization of, respectively, the BamHI, EcoRI, and HindIII digests with the wtAAV2 probe represent a rearrangement within the provirus (Fig. 2B and Fig. S3). Southern blot analysis using AAV noncutters did not show any evidence for an additional randomly integrated copy of AAV.

To further confirm partial duplication of the MBS85 allele, we performed 2 PCRs that were designed to, respectively, amplify genomic DNA from the third intron of MBS85 to the wtAAV2 Rep gene (4.5 kb) and from the wtAAV2 Cap gene to the third intron of MBS85 (7 kb) (Fig. 2 A and C). The 1-kb overlapping region present in both PCR products provides strong evidence that part of the MBS85 allele had been duplicated during the integration process.

The MBS85 duplication was also demonstrated by Southern blot analysis in HeLa-T1 cells (Fig. S4). In addition to the bands that represent the duplication-inducing site-specific integration of wtAAV2, there are bands that can be explained by additional Rep-induced AAVS1 rearrangements. The 11- and 5.3-kb bands, visible after hybridization of, respectively, the HindIII and BamHI digests with the MBS85 probe, represent a junction between nucleotide 11,469 and nucleotide 1 of MBS85 as shown by PCR (Fig. S4). This junction corresponds to an 8.3-kb band in the EcoRI digest and thus comigrates with the parental band. This observation shows that at least some of the additional AAVS1 rearrangements can be explained by Rep-mediated MBS85 duplication in the absence of viral DNA integration. Similar to what we observed in HeLa-T3 cells, the 6-kb bands, visible after hybridization of the EcoRI and BamHI digests with the wtAAV2 probe, represent a rearrangement within the provirus (Fig. S4).

Together, these data demonstrate that partial duplication of the MBS85 locus occurs in latently infected HeLa cells as a result of Rep-mediated site-specific integration.
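As a rough illustration of the duplication model in Fig. 2A, the sketch below assumes that the cellular sequence lying between the right junction and the left junction is the part of MBS85 present on both sides of the provirus. The junction positions are those quoted in the text for the first junction pair of each HeLa clone, with kilobase values rounded to nucleotides (an assumption). The function name and data layout are hypothetical and are not part of the original analysis.

```python
# Illustrative sketch only: estimates the duplicated MBS85 segment implied by
# the duplication model, using junction positions quoted in the text.
# Positions are nucleotides downstream of the AAVS1 TRS (nucleotide 1);
# kb values are rounded, so the results are approximate.

# (left junction, right junction) per cell line; values assumed from the text.
JUNCTIONS_NT = {
    "HeLa-T1": (12_900, 13),
    "HeLa-T2": (14_000, 4),
    "HeLa-T3": (9_300, 24),
}


def duplicated_length(left_junction_nt: int, right_junction_nt: int) -> int:
    """Approximate length of cellular sequence found on both sides of the provirus.

    Assumes MBS85 sequence up to the left junction lies upstream of AAV and
    MBS85 sequence from the right junction onward lies downstream of it.
    """
    if left_junction_nt < right_junction_nt:
        raise ValueError("left junction is expected downstream of the right junction")
    return left_junction_nt - right_junction_nt


if __name__ == "__main__":
    for cell_line, (left_nt, right_nt) in JUNCTIONS_NT.items():
        length_kb = duplicated_length(left_nt, right_nt) / 1000
        print(f"{cell_line}: ~{length_kb:.1f} kb duplicated")
```

Under these assumptions, each of the three clones would carry roughly 9 to 14 kb of duplicated MBS85 sequence, which is in line with the extensive but partial duplication described above.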

rAAV2 Integrates Site-Specifically in Mouse ES Cells Through Partial Duplication of Mbs85 Sequences. To investigate whether the mouse could be used as a model system to study the consequences of the observed site-specific integration mechanism, we initiated an integration assay based on coinfection of mouse ES cells. CCE cells were coinfected with recombinant AAV2 (rAAV2; GFP and neomycin resistance genes) and wtAAV2 (providing rep) viruses. Given the low infectivity of mouse ES cells (2% at a MOI of 10^6 gcp per cell), the observed frequency of site-specific integration was expected to be very low (2% of GFP-positive cells are expected to also have been infected with wtAAV). Assuming an integration frequency of <10%, the likelihood of identifying a targeted integrant was 1 in 500 clones analyzed. Southern blot analysis using an Mbs85-specific probe on HindIII (rAAV2 noncutter)-digested DNA isolated from G418-resistant clones demonstrated that 1 of the 106 clones analyzed showed an additional band to the expected undisrupted 7.1-kb band (Figs. S2 and S5). The additional band comigrated with the recombinant vector DNA, suggesting site-specific integration of 1 rAAV2 molecule into Mbs85. Southern blot analysis and quantitative real-time PCR were negative for wtAAV2 rep and cap sequences, indicating that wtAAV2 integration did not occur in this clone. In this latently infected mouse ES cell line, designated CCE-T, the junction between the left AAV ITR and cellular DNA was located in Mbs85, at 8.5 kb downstream from the TRS motif (Fig. 3A). The junction with the right ITR was found in the promoter region of the Mbs85 gene. At the viral breakpoints, both left and right ITRs were missing 50 nt. Interestingly, insertion of an unknown 18-nt sequence was observed at the right viral–cellular junction. Comparison of the left and right junctions revealed that this 18-nt sequence is the reverse complement of the Mbs85 sequence present at the left junction (Fig. S1).

Together, these data show that the molecular structure of the AAV integrant in mouse ES cells is very similar to the structure observed in latently infected HeLa cells, suggesting that partial duplication of Mbs85 has occurred as a result of Rep-mediated site-specific integration of recombinant AAV2.

The duplication in CCE-T cells was further demonstrated by Southern blot analysis and PCR. As predicted from the proposed duplication model (Fig. 3B), hybridization of EcoRI-digested DNA with the Mbs85 probe generated a 13-kb band corresponding to the undisrupted allele, and 2 bands at 12 and 11.1 kb, representing the disrupted allele (Fig. 3C Left). As expected, the 11.1-kb band cohybridized with the rAAV2 probe (Fig. 3C Right). Similar results were obtained with Southern blot analysis of BamHI- and HindIII-digested DNA (Fig. S5). To further confirm partial duplication of the Mbs85 allele, we performed 2 nested PCRs that were designed to respectively amplify genomic DNA from the first intron of Mbs85 to the rAAV2 CAG promoter (5.2 kb) and from the rAAV2 bGH polyadenylation site to the third exon of Mbs85 (4.9 kb) (Fig. 3 B and D). The 688-bp overlapping region present in both PCR products provides strong evidence that part of the Mbs85 allele had been duplicated during the integration process.

AAV Site-Specific Integration Does Not Affect Gene Expression of the Target Locus. Given that the molecular structure of the integrant present in the latently infected mouse ES line is highly representative of all human clones analyzed to date, this cell line can serve as a diploid model system to study the consequences of AAV site-specific integration and the resulting partial MBS85 duplication. We first investigated whether site-specific integration interfered with MBS85 transcriptional activity. Quantitative real-time RT-PCR experiments were designed to detect expression driven by the Mbs85 promoters located on the undisrupted allele and downstream of the right junction of the disrupted allele. As shown in Fig. 4A, Mbs85 mRNA levels were similar in both parental and targeted mouse ES cell lines. Northern blot analysis using an exon1–5 probe did not show any truncated or aberrant transcripts transcribed from the undisrupted promoter upstream from the left viral–cellular junction. Nevertheless, we cannot completely rule out that unstable transcripts are generated from this promoter. As expected from Southern blot analysis showing intact Tnnt1 and Tnni3 genes (Fig. S6), quantitative real-time RT-PCR data indicated that expression levels of these genes were not significantly different from those observed in the parental cell line (Fig. 4A). Altogether, our data demonstrate that expression levels of Mbs85 or any of the Mbs85-linked genes were not influenced by the integration event.

AAV Site-Specific Integration Does Not Affect the in Vitro and in Vivo Multilineage Differentiation Capacity of ES Cells. Finally, we took advantage of the features of ES cells to investigate whether cells that have been subjected to duplication-based site-specific integration of rAAV2 retain full functionality. First, we showed that latently infected mouse ES cells displayed normal, alkaline phosphatase-positive morphology (Fig. 4B) and that expression of the ES

Fig. 3. Site-specific integration of rAAV2 in mouse ES cells occurs through partial duplication of the target site. (A) Schematic representation of the molecular structure of site-specifically integrated rAAV2 in mouse ES cells. Black boxes represent the first 3 exons of the Mbs85 gene. Vertical dashed lines indicate the junctions between Mbs85 and rAAV2. White boxes represent the left (AAV-L) and right (AAV-R) side of integrated rAAV2. Numbers indicate the nucleotide positions of the junctions relative to the AAVS1 TRS motif. (B) The duplication hypothesis. The sizes of the restriction fragments that hybridize to the Mbs85-specific (forward hatched box) and/or rAAV2 (backward hatched box) probes are shown for the disrupted (Top) and undisrupted allele (Bottom). Black and gray boxes, respectively, represent Mbs85 exons and integrated rAAV2. The dashed line indicates that the sizes of the introns are not proportional. (Center) A schematic representation of the PCRs showing duplication of Mbs85 sequences. (C) Southern blot analysis of EcoRI-digested genomic DNA from CCE and CCE-T cells using the Mbs85 or rAAV2 probe. Cohybridization is indicated by the asterisks. Sizes are in kilobases. (D) Images of the long PCR products encompassing the left and right junction, used to show duplication of Mbs85 sequences.
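The screening estimate quoted for the mouse ES cell integration assay (2% coinfection with wtAAV, an integration frequency of <10%, hence 1 targeted integrant in 500 clones) can be reproduced with simple arithmetic. The sketch below is an illustration, not part of the original analysis: it treats the two figures as independent probabilities, which is an assumption, and the function names are hypothetical.

```python
# Illustrative arithmetic for the clone-screening estimate.
# Assumes coinfection and integration are independent events, and uses the
# 10% integration frequency as an upper bound, as stated in the text.


def targeted_fraction(coinfected_fraction: float, integration_frequency: float) -> float:
    """Fraction of analyzed clones expected to carry a targeted integrant."""
    return coinfected_fraction * integration_frequency


def prob_at_least_one(fraction: float, clones_screened: int) -> float:
    """Probability of finding one or more targeted clones among those screened."""
    return 1.0 - (1.0 - fraction) ** clones_screened


if __name__ == "__main__":
    fraction = targeted_fraction(0.02, 0.10)
    print(f"about 1 targeted clone per {round(1 / fraction)} clones")
    print(f"P(at least 1 hit in 106 clones) = {prob_at_least_one(fraction, 106):.2f}")
```

The product 0.02 × 0.10 = 0.002 corresponds to the 1 in 500 figure. The second function gives a sense of how likely it was, under the same assumptions, to recover at least one targeted clone from the 106 clones that were analyzed.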

Fig. 4. Site-specific integration-induced duplication of Mbs85 sequences does not alter Mbs85 expression levels and leaves the multilineage in vitro differentiation capacity of ES cells unchanged. (A) Fold change in Mbs85, Tnnt1, and Tnni3 expression levels in CCE-T relative to the control, CCE, as determined by real-time PCR and the 2^(-ΔΔCT) method. (B) Alkaline phosphatase staining of CCE-T cells. (C) Expression levels of Nanog, Oct4, Rex1, and β-actin in CCE and CCE-T cells as determined by RT-PCR. (D) Fluorescence image of d4 EBs. (E) Flow cytometry on single-cell suspensions prepared from targeted d4 EBs shows expression levels of c-kit and Flk1 indicative of typical differentiation. (F–H) Bright-field (Left) and corresponding fluorescence (Right) images of, respectively, a cardiomyocyte cluster (see also Movie S1), a blast colony, and neurons derived from CCE-T cells. Additional Tuj1 staining is shown in H Right.
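For readers unfamiliar with the ΔΔCT calculation named in panel A, a minimal sketch follows. It shows the standard form of the method. The reference gene, the function name, and the Ct values in the usage lines are hypothetical placeholders, since the normalization details for this study are given only in SI Text.

```python
# Minimal sketch of the 2^(-ddCt) relative-expression calculation.
# All Ct values used below are made-up placeholders, not data from the study.


def fold_change_ddct(ct_target_test: float, ct_ref_test: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in the test sample relative to the control.

    Each sample is first normalized to a reference gene (dCt), then the test
    sample is compared with the control sample (ddCt).
    """
    dct_test = ct_target_test - ct_ref_test
    dct_control = ct_target_control - ct_ref_control
    ddct = dct_test - dct_control
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Identical normalized Ct values in both samples give a fold change of 1,
    # i.e. no difference in expression between test and control.
    print(fold_change_ddct(24.0, 18.0, 24.0, 18.0))
```

The formula assumes that target and reference amplify with close to 100% efficiency, so that one cycle corresponds to a twofold difference in template. A fold change near 1 for CCE-T relative to CCE is what "similar expression levels" means in this framework.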

cell-specific markers Oct4, Nanog, and Rex1 was comparable with expression observed in the undisrupted parental cell line (Fig. 4C). Next, we determined whether site-specific integration of rAAV2 had altered the differentiation properties of mouse ES cells. Embryoid bodies (EB) derived from targeted ES cells formed in a timely manner and had a normal size and morphology (Fig. 4D); VEGF receptor 2 (Flk-1) and c-kit expression levels were indicative of typical differentiation patterns (Fig. 4E) (20). Because Mbs85 is closely linked to the muscle-specific genes Tnnt1 and Tnni3, it was of particular interest to determine the cardiomyocyte potential of AAVS1-targeted mouse ES cells. Under the appropriate growth conditions, CCE-T-derived d4 EB cells were able to differentiate into contracting cardiomyocytes (Fig. 4F and Movie S1). Hematopoietic and endothelial potential were tested by using the blast colony-forming assay, which supports the growth of the hemangioblast (21). This assay showed that blast colonies developed 3 d after initiation of the assay as expected from studies with parental cell lines (Fig. 4G). Finally, by using a neuronal differentiation assay, targeted ES cells differentiated into neurons as confirmed by expression of the neuron-specific marker Tuj1 (Fig. 4H). As can be seen in Fig. 4 F Right to H Right, GFP expression remained robust throughout differentiation. Together, our data suggest that site-specific integration of rAAV2 into Mbs85 does not interfere with multilineage in vitro differentiation of ES cells.

To determine whether Mbs85-targeted mouse ES cells can contribute to all lineages in vivo and whether transgene expression can be sustained through extensive in vivo proliferation and differentiation, we injected the CCE-T cells into blastocysts. The resulting chimeric animals were killed to analyze morphology and GFP expression in various tissues. Macroscopic analysis of all organs harvested did not reveal any abnormalities. Fig. 5 shows the analysis of heart, skeletal muscle, liver, kidney, and brain obtained from a control C57BL/6 mouse and a representative example of a chimeric mouse with contribution from Mbs85-targeted ES cells. In all tissues analyzed, we observed that GFP-expressing cells had contributed significantly to the development without disturbing the normal histology. Except for the brain, targeted cells displayed robust GFP expression. Overall, our in vivo studies highlight that site-specific integration of AAV does not appear to have adverse effects on the healthy development and maintenance of numerous tissues.

Fig. 5. Mouse ES cells subject to site-specific integration-induced duplication of Mbs85 sequences contribute significantly to mouse development. (Left) Fluorescence images of glycol methacrylate (GMA) sections of the indicated organs harvested from a control C57BL/6 mouse. (Center) Fluorescence images of GMA sections of the indicated organs harvested from a chimeric mouse. (Right) Corresponding images of hematoxylin and eosin-stained GMA sections of the chimeric tissues. All images are taken at a 20× magnification.

Discussion

The ability of wild-type AAV to integrate its genome into a specific locus on chromosome 19 is unique within eukaryotic systems. To dissect the mechanism of AAV site-specific integration and to address possible effects of insertional gene disruption, we took advantage of the presence of a mouse Mbs85 ortholog and of the fact that the human and mouse AAVS1-linked genes display the same overall chromosomal organization (15). In the present study, we analyzed the molecular structure of AAV integrants in several previously uncharacterized as well as previously established latently infected human cell lines and generated a mouse ES cell line carrying site-specifically integrated rAAV2, which bears the same characteristics as the wtAAV2 integrants present in AAVS1 in human cell lines. With respect to the AAV component of the junction, the ITRs are partially deleted, and the breakpoints lie close to the viral RBS motif (3, 22). Other hallmarks are the microhomology between the viral and cellular sequences, the presence of a short "unknown" sequence (23, 24), and the close proximity of the right junction to the cellular TRS motif. It should be noted that most previous studies focused on the isolation of viral–cellular junctions based on direct PCR technologies that would only allow for the amplification of junctions located close to the AAVS1 TRS/RBS motifs (17, 24).

Our analysis also revealed several striking additional features. First, the left viral–cellular junctions were all located unexpectedly far downstream from the cellular TRS/RBS motif, whereas the right junctions were found in close proximity to the TRS/RBS (Fig. 1). Few of the previously identified left junctions were found closer to the TRS/RBS motifs (22, 24, 25), suggesting that the left cellular breakpoint might occur anywhere downstream of the TRS motif, most frequently at a considerable distance. In a recent study, Drew et al. (26) identified 11 left viral–cellular junctions, of which the cellular breakpoints were scattered from 2.3 kb to >14 kb from the TRS motif. The second surprising feature is the 5′–3′ direction of the Mbs85 sequences adjacent to the left and right junctions. All integrants were present in the same 5′–3′ orientation as compared with the MBS85 gene. Finally, comparison of the contiguous left and right junctions present in some of the MBS85-targeted HeLa and mouse ES cell lines demonstrated that the previously unknown sequences present at one of the viral–cellular junctions originated from the other junction.

Altogether, these observations led us to hypothesize that AAV integrates by duplicating the upstream MBS85 sequences while leaving the downstream sequences virtually unaltered. Southern blot analysis and PCRs confirm that the proposed duplication-based integration event had indeed taken place in both human and mouse MBS85-targeted genomes, both with wild-type AAV and recombinant AAV, where Rep had been provided in trans. These somewhat surprising characteristics provide insights into the mechanism of Rep-mediated site-specific integration and led us to extend our model for AAV site-specific DNA integration. Our previous model involved the binding of an oligomeric form of Rep to both viral and cellular RBS, followed by a strand-specific nick at the TRS in AAVS1 and DNA replication undergoing 3 consecutive strand switches (27). This model is affected by our findings in that it now has to take into account the following previously undescribed properties. (i) The structure of the left and right junctions suggests that the first recombination event involves the left side of the viral genome, whereas the right junction might occur at a later stage during the integration steps, thereby also defining the 5′–3′ orientation of the integrant. It is possible that the viral p5 promoter (28), or an exogenous promoter in the case of rAAV, plays an important role in the formation of the initial recombination complex. (ii) Replication and extension of the elongating strand are much more extensive than previously thought. (iii) After extension, the replication switches templates onto AAV, generating the junction with the left ITR. (iv) The replication fork might switch back to the MBS85 sequences adjacent to the left junction, thus generating a short sequence that can then be found at the right junction. (v) The displaced strand, covalently bound to Rep by its 5′ end, forms a junction with the 3′ end of the newly replicated strand. Although we have no direct evidence for this ligation step, Rep has previously been demonstrated to catalyze the ligation of single-stranded AAV origin DNA substrates (29). It must be noted that this step would predict that the right junction is formed with the 5′ end of the displaced strand, i.e., the nicking site within the TRS. However, although our right junctions are in proximity of this site, they rarely map exactly to this nucleotide. It remains to be determined which process is involved in the removal of those nucleotides. In addition, the right junction in the mouse ES cell line occurred 250 bp upstream from the previously identified TRS motif (15). Given that the sequence of the mouse TRS/RBS motif is somewhat different from the human motif, it is possible that Rep introduced a nick at an as yet unidentified TRS motif. (vi) A second nick generating a free 3′-OH terminus should occur on the template strand in order for the DNA polymerase to fill in the newly generated AAV-AAVS1 sequences. Alternatively, missing nucleotides are filled in during the next round of replication. In Fig. 6 we propose a simplified model that outlines the possible steps of the mechanism of AAV site-specific integration.

Fig. 6. Model for site-specific integration of AAV. (I) Rep-mediated strand-specific nick at the TRS (or TRS-like structure) in AAVS1. (II) DNA synthesis originating at the TRS (or TRS-like structure) resulting in strand displacement. (III) Template strand switch onto AAV. (IV) Occasional second template strand switch back onto AAVS1 generating an inverted repeat. (V) Ligation between the "unknown sequence," or alternatively AAV, and the displaced strand. (V and VI) Nick introduced at the bottom strand and DNA synthesis of the noncomplementary strand. (VII) AAV site-specific integration results in partial duplication of MBS85 sequences.

Our model implies that the process of integration is more precise than previously suggested. Importantly, in the mouse ES cell line, integration allows for the preservation of 2 functional Mbs85 alleles, preserving normal expression from the undisrupted allele as well as from the Mbs85 allele that served as the integration target. Further experiments are still required to determine whether expression from the duplicated allele could lead to unstable or aberrant MBS85 mRNA products. Mouse ES cells carrying site-specifically integrated rAAV2 performed as well as their unmodified counterparts in a series of stringent in vitro assays, providing evidence that the integration event did not have any discernible effects. Moreover, these cells maintained their ability to fully participate in mouse development when injected into blastocysts in vivo. The absence of any discernible effect as a result of AAV-mediated DNA integration into the densely populated AAVS1 might be explained by the fact that through the observed duplication, a functional promoter is preserved in front of an intact MBS85 gene, downstream of the integrated exogenous DNA. Alternatively, the observed duplication leaves the possibility that the viral and duplicated cellular DNA can be spliced out, which in turn could restore normal MBS85 expression.

To our knowledge, AAV genome integration is the only example of targeted gene addition in the eukaryotic system that has evolved a strategy capable of avoiding adverse insertional mutagenesis. Here, we put forward a possible mechanism that can explain this unique phenomenon.

Materials and Methods

ES Cell Growth and Differentiation. Mouse ES cells (CCE) were maintained and differentiated following standard protocols. See SI Text.

Production of Recombinant and Wild-Type AAV. rAAV2 and wtAAV2 were generated by using standard procedures. See SI Text.

Integration Assays. Mouse ES cells were coinfected with rAAV2 and wtAAV2 in feeder-free conditions at an MOI of 10⁶ gcp per cell. At 48 h after infection, ES cells were harvested for flow cytometry and replated on neomycin-resistant MEF. G418 selection (300 μg/mL) was started 24 h after plating. At 5 d after selection, G418-resistant clones were aspirated, trypsinized, and seeded in MEF-containing 24-well plates. The clones were expanded and harvested for flow cytometry and genomic DNA extraction.

HeLa cells were maintained in standard conditions and infected with wtAAV2 at an MOI of 10⁴ gcp per cell. Cells were passaged 6 times before single-cell sorting to dilute out the episomal AAV genomes.

Flow Cytometry. rAAV2-infected cells were analyzed for GFP expression on a Facscalibur flow cytometer (Becton Dickinson). Cell sorting and single-cell deposits were performed on a MoFlo flow cytometer (DAKO). See SI Text.

Southern Blot Analysis. The DNeasy Tissue kit (Qiagen) was used for all genomic DNA extractions. Southern blot analyses were performed as described previously (16). The Mbs85, GFP-Neo, and MBS85 probes were generated by PCR performed on plasmids containing the respective sequences. The following probes were generated by digestion: Tnni3, Tnnt1, Eps8l1 (all EcoRI, pCR2.1), wtAAV2 (BglII, pAV2), rAAV2 (SmaI, pTRUF11), pRVK [EcoRI (nt −396) KpnI (nt +3179)]. See Table S1.

Right Junction PCR. Primers and cycling conditions used to identify the right viral–cellular junctions are listed in Table S1. PCR products were cloned into pCR2.1 (TOPO TA cloning kit, Invitrogen) and sequenced (pM4, pH19, pH24R, and pH49R). Similar conditions were used to identify the MBS85 duplication, which occurred in the absence of AAV integration (p49MM1 and p49MM2).

Linker-Mediated PCR (LM-PCR). The integration site of rAAV2 in CCE-T cells was cloned by using a protocol adapted from the GenomeWalker Universal kit (Clontech) and NlaIII digestion (lm-pcr1 and -2). In HeLa cells, some of the left junctions were identified by using the GenomeWalker Universal kit and Advantage 2 PCR Enzyme System (Clontech) (lm-pcr3 and -4). Primers and cycling conditions are described in Table S1. PCR products were cloned into pCR2.1 and sequenced.

Left Junction PCR. Primers and cycling conditions used to identify left viral–cellular junctions by direct PCR are listed in Table S1. PCR products were cloned into pCR2.1 and sequenced (pH24L, pH49L).

PCRs Showing Duplication. Primers and cycling conditions for p24LD and p24RD (HeLa-T3) and pcr1, p4LD, pcr2, and p4RD (CCE-T) are described in Table S1. The PCR products were cloned into the pCR2.1 vector and sequenced.

RT-PCR. The RNeasy Mini kit and RNase-free DNase Set (Qiagen) were used for all RNA extractions. Total RNA (1.5 μg) was reverse-transcribed with random hexamers using the Omniscript Reverse Transcription kit (Qiagen). Primers and cycling conditions used to amplify β-actin, Rex-1, Oct4, and Nanog are described in Table S1.

Real-Time RT-PCR. Three RNA samples each were isolated from CCE-T and parental cell line CCE. cDNA from each extraction was produced from 1 μg of total RNA with the Omniscript RT kit (Qiagen). cDNA was diluted 10-fold, and 3 replicates of each cDNA sample were used as template in real-time quantitative PCR. See SI Text.

Alkaline Phosphatase and Tuj1 Staining. CCE-T cells were stained by using the Vector Red Alkaline Phosphatase Substrate kit I (Vector Laboratories). Tuj1 staining of EB-derived neurons was carried out as previously described. See SI Text. Images were taken on a Leica DM IRB inverted microscope equipped with a digital camera (Mintron).

Generation and Analysis of Chimeric Animals. Chimeric animals were generated in the Mouse Genetics Shared Research Facility at Mount Sinai School of Medicine by following standard procedures. GFP expression and morphology were analyzed on sections of tissue blocks prepared by using the Technovit H8100 kit (Kulzer; Electron Microscopy Sciences). Fluorescence images were acquired by using a fluorescence microscope (DMRA2; Leica) and a digital CCD camera (model ORCA-ER; Hamamatsu). Hematoxylin and eosin staining of the sections was carried out by following standard procedures; images were acquired by using a Leica DM LB equipped with a Spot digital camera (Diagnostic Instruments). Experiments and animal care were performed in accordance with the Mount Sinai Institutional Animal Care and Use Committee.

ACKNOWLEDGMENTS. This work was supported by National Institutes of Health Grants GM071023, GM075019, and DK062345 (to R.M.L.). E.H. was the recipient of a Charles H. Revson Senior Fellowship in Biomedical Science. S.K. was the recipient of Postdoctoral Fellowship F32-HL678112 from the National Heart Lung and Blood Institute.

1. Ward P (2006) Replication of adeno-associated virus DNA. Parvoviruses, eds Kerr JR, Cotmore SF, Bloom ME, Linden RM, Parrish CR (Hodder Arnold, London), pp 189–212.
2. Kotin RM, et al. (1990) Site-specific integration by adeno-associated virus. Proc Natl Acad Sci USA 87(6):2211–2215.
3. Samulski RJ, et al. (1991) Targeted integration of adeno-associated virus (AAV) into human chromosome 19. EMBO J 10(12):3941–3950.
4. Surosky RT, et al. (1997) Adeno-associated virus Rep proteins target DNA sequences to a unique locus in the human genome. J Virol 71(10):7951–7959.
5. Ponnazhagan S, et al. (1997) Lack of site-specific integration of the recombinant adeno-associated virus 2 genomes in human cells. Hum Gene Ther 8(3):275–284.
6. McCarty DM, et al. (1994) Identification of linear DNA sequences that specifically bind the adeno-associated virus Rep protein. J Virol 68(8):4988–4997.
7. Linden RM, Winocour E, Berns KI (1996) The recombination signals for adeno-associated virus site-specific integration. Proc Natl Acad Sci USA 93(15):7966–7972.
8. Im DS, Muzyczka N (1990) The AAV origin binding protein Rep68 is an ATP-dependent site-specific endonuclease with DNA helicase activity. Cell 61(3):447–457.
9. Snyder RO, Samulski RJ, Muzyczka N (1990) In vitro resolution of covalently joined AAV chromosome ends. Cell 60(1):105–113.
10. Weitzman MD, Kyostio SR, Kotin RM, Owens RA (1994) Adeno-associated virus (AAV) Rep proteins mediate complex formation between AAV DNA and its integration site in human DNA. Proc Natl Acad Sci USA 91(13):5808–5812.
11. Urcelay E, Ward P, Wiener SM, Safer B, Kotin RM (1995) Asymmetric replication in vitro from a human sequence element is dependent on adeno-associated virus Rep protein. J Virol 69(4):2038–2046.
12. Young SM, Jr, Samulski RJ (2001) Adeno-associated virus (AAV) site-specific recombination does not require a Rep-dependent origin of replication within the AAV terminal repeat. Proc Natl Acad Sci USA 98(24):13525–13530.
13. Hamilton H, Gomos J, Berns KI, Falck-Pedersen E (2004) Adeno-associated virus site-specific integration and AAVS1 disruption. J Virol 78(15):7874–7882.
14. Tan I, Ng CH, Lim L, Leung T (2001) Phosphorylation of a novel myosin binding subunit of protein phosphatase 1 reveals a conserved mechanism in the regulation of actin cytoskeleton. J Biol Chem 276(24):21209–21216.
15. Dutheil N, et al. (2004) Characterization of the mouse adeno-associated virus AAVS1 ortholog. J Virol 78(16):8917–8921.
16. Dutheil N, Shi F, Dupressoir T, Linden RM (2000) Adeno-associated virus site-specifically integrates into a muscle-specific DNA region. Proc Natl Acad Sci USA 97(9):4862–4866.
17. Dutheil N, Linden RM (2006) Site-specific integration by adeno-associated virus. Parvoviruses, eds Kerr JR, Cotmore SF, Bloom ME, Linden RM, Parrish CR (Hodder Arnold, London), pp 213–237.
18. Schroder AR, et al. (2002) HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110(4):521–529.
19. McCarty DM, Ryan JH, Zolotukhin S, Zhou X, Muzyczka N (1994) Interaction of the adeno-associated virus Rep protein with a sequence within the A palindrome of the viral terminal repeat. J Virol 68(8):4998–5006.
20. Kabrun N, et al. (1997) Flk-1 expression defines a population of early embryonic hematopoietic precursors. Development 124(10):2039–2048.
21. Kennedy M, et al. (1997) A common precursor for primitive erythropoiesis and definitive haematopoiesis. Nature 386(6624):488–493.
22. Kotin RM, Berns KI (1989) Organization of adeno-associated virus DNA in latently infected Detroit 6 cells. Virology 170(2):460–467.
23. Yang CC, et al. (1997) Cellular recombination pathways and viral terminal repeat hairpin structures are sufficient for adeno-associated virus integration in vivo and in vitro. J Virol 71(12):9231–9247.
24. McAlister VJ, Owens RA (2007) Preferential integration of adeno-associated virus type 2 into a polypyrimidine/polypurine-rich region within AAVS1. J Virol 81(18):9718–9726.
25. Tsunoda H, Hayakawa T, Sakuragawa N, Koyama H (2000) Site-specific integration of adeno-associated virus-based plasmid vectors in lipofected HeLa cells. Virology 268(2):391–401.
26. Drew HR, Lockett LJ, Both GW (2007) Increased complexity of wild-type adeno-associated virus-chromosomal junctions as determined by analysis of unselected cellular genomes. J Gen Virol 88(Pt 6):1722–1732.
27. Linden RM, Ward P, Giraud C, Winocour E, Berns KI (1996) Site-specific integration by adeno-associated virus. Proc Natl Acad Sci USA 93(21):11288–11294.
28. Philpott NJ, Gomos J, Berns KI, Falck-Pedersen E (2002) A p5 integration efficiency element mediates Rep-dependent integration into AAVS1 at chromosome 19. Proc Natl Acad Sci USA 99(19):12381–12385.
29. Smith RH, Kotin RM (2000) An adeno-associated virus (AAV) initiator protein, Rep78, catalyzes the cleavage and ligation of single-stranded AAV ori DNA. J Virol 74(7):3122–3129.
