<<

Integration by design

Suzanne Sandmeyer* Department of Biological Chemistry, College of Medicine, University of California, Irvine, CA 92697-1700

s goes history, so goes research: are broadly accessed by retro- ends of the full-length DNA mediates this year, activity in areas of , but that there are decidedly integration into host DNA. Isolation research related only nonrandom patterns as well (8). More first of preintegration complexes from Aindirectly have provoked events recently, large numbers of HIV type 1 infected cells and then production of that are notable when considered to- (HIV-1) insertions have been mapped and active, recombinant IN allowed exami- gether. Last summer it was reported compared with genomewide transcrip- nation of the effect of different target that a patient in one X-linked severe tion patterns to globally probe the rela- features on integration in vitro. A gener- combined immunodeficiency retroviral tionship between gene expression and alization that has emerged from studies vector gene therapy trial had developed retrovirus integration (9). These experi- conducted in several laboratories is that leukemia. Now disquietingly, there has ments showed that HIV-1 insertion fa- bending of DNA favors integration (18), been a second such event, and a third vors transcribed regions. Nonetheless, as do hairpin structures (19). The patient is reported to have a vector in- the basis of the preference for tran- former occurs in nucleosomes, which, sertion near the same gene (LMO2)as scribed regions has been elusive, and contrary to expectations, were found to observed in the other two individuals examination of at least one transcribed act as preferred targets over nonnucleo- (1). Meanwhile, in a basic research labo- region for effects of transcriptional ac- somal DNA, both in vitro and in vivo ratory, experiments have moved us an- tivity on integration activity have not (20–22). other step closer to understanding the shown a positive correlation (10). The relatively global distribution of mechanics of insertion specificity for At the heart of retroviral integration retrovirus integration sites stands in retrovirus-type integrases (IN). As re- is the IN. It is a member of the interesting contrast to the distinctive ported in this issue of PNAS, investiga- D,D(35)E transposase͞IN superfamily insertion preferences of their LTR- tors have produced active retroviruslike named after its conserved catalytic triad cousins, the Pseudo- elements with synthetic insertion speci- viridiae (e.g., Ty1 and Ty5 copialike ele- ficities (2). Dan Voytas and colleagues ments) (23) and the (e.g., at Iowa State University (Ames) study Potential deleterious Tf1 and Ty3 gypsylike elements) (24). the Saccharomyces long terminal repeat IN encoded by these elements (LTR)-retrotransposon Ty5, which tar- retrovirus insertions have the zinc-binding motif, the highly gets heterochromatic regions (3). Now, conserved residues of the central do- in an elegant adaptation of the two-hy- fueled investigation main and the poorly conserved C-termi- brid system, the 6-aa Ty5 targeting do- nal domain. The IN proteins of the main (TD) was exchanged for two heter- into the mechanism of and the Metaviridae differ ologous domains shown to mediate from each other in the C-terminal do- interaction of their respective proteins insertion site selection. main where the Pseudoviridae have a conserved GKGY motif (23), and the with partners. When domains ͞ from those partners were produced Metaviridae have a conserved GPF Y motif. Some members of the Metaviridae fused to the LexA DNA-binding do- of amino acids. Because of its central also have a chromodomain (24). main, targeting to LexA-binding sites role in the retrovirus lifecycle, the func- As a group, the yeast LTR retrotrans- was observed. Although integration tion and structure of this has posons have notable insertion prefer- specificity in the system was by no been studied extensively (reviewed in ences. The specificity of Ty5 for hetero- means absolute, these results are of in- refs. 11–15). Retroviral IN mediates a chromatin is discussed further below. In terest to genetic engineers and future strand transfer of LTR DNA 3Ј OH budding yeast, the Pseudoviridae Ty1, 2, gene therapists. ends to staggered positions in the host and 4 reside mostly within 750 bp of the DNA (16, 17). Combined evidence of Interest in the integration patterns of 5Ј ends of tRNA genes (25, 26). In vivo is longstanding. Despite the many types shows a retroviral IN with insertions fall along a gradient begin- potential danger of deleterious activat- three physically distinct domains. An Ϫ Ј ␣ ning at about 80 bp from the 5 cod- ing or even inactivating insertions, retro- N-terminal domain includes three - ing end of the tRNA gene and extend- viruses present compelling advantages as helices and a zinc-binding motif. This ing upstream. Integration appears to rise therapy vectors (reviewed in ref. 4). domain has been implicated in dimeriza- and fall in a pattern which could corre- Early investigations of oncogenic retro- tion and in binding the LTR ends. The late with some feature of the nucleo- insertion sites in transformed cells central domain contains the conserved some (27). The pattern of integration of showed that insertions were linked to catalytic triad D,D(35)E. Members of the Metaviridae element Ty3 is even activation of flanking oncogenes or this triad coordinate a divalent metal more restricted. The gene-proximal 2ϩ DNaseI hypersensitive sites, leading to cation, probably Mg in vivo (15) and strand transfer in this case occurs within the notion that insertion into open chro- are essential for catalytic activity. The one or two of tRNA gene matin was favored (reviewed in ref. 5; C-terminal domain contributes to oli- transcription initiation sites. In vivo it is see also refs. 6 and 7). The potential for gomerization, has nonspecific DNA- likely that transcription factors TFIIIB deleterious retrovirus vector insertions binding activity and is physically similar and TFIIIC are essential for Ty3 target- fueled investigation into the mechanistic to the SH3 protein interaction domain. ing (28–30). Furthermore, it has been basis of insertion site selection. Devel- No full-length IN structure has yet been opment of PCR assays with which signif- determined at high resolution. icant numbers of retrovirus integration In vivo a retroviral preintegration See companion article on page 5891. sites could be mapped showed that complex composed of IN bound to the *E-mail: [email protected].

5586–5588 ͉ PNAS ͉ May 13, 2003 ͉ vol. 100 ͉ no. 10 www.pnas.org͞cgi͞doi͞10.1073͞pnas.1031802100 Downloaded by guest on September 29, 2021 COMMENTARY

shown that yeast elements Ty1–4 target other genes transcribed by RNA poly- merase III with similar patterns to those observed flanking tRNA genes (27, 30). In vitro, Ty3 targeting to the U6 gene requires only TATA-binding protein and Brf1 (29). Observation of highly specific integra- tion in yeast helped to motivate a series of experiments to confer novel insertion specificities on retrovirus IN proteins (reviewed in refs. 31 and 32). Recombi- nant retroviral IN has been expressed as a fusion with relatively compact DNA- binding domains including lambda re- pressor (33), LexA DNA-binding do- main (34, 35), and the DNA-binding domain of Zif268 (36). Recombinant proteins have been shown to target in vitro integration to the respective DNA- binding sites of the fusion proteins. Dis- appointingly, these chimeric IN species, appear to be incompatible with high lev- els of infectious virus. Presumably this is caused by some failure to structurally Fig. 1. Strategy for retargeting Ty5 integration. Top, schematic of Ty5 single ORF encoding RNA binding accommodate the heterologous domain. (RB), protease (PR), integrase (IN), (RT), and marker gene (his3AI) (open box). View To circumvent some of these problems, of IN is expanded to show conserved residues and targeting domain (TD) (solid). Lower left, preintegration a strategy involving trans expression of complex showing wild-type IN bound to ends of Ty5 DNA (thick line) and integrating into telomeric IN has been used. In this variation, a heterochromatin, mediated by Sir4p (hatched). Lower right, same as left except that the natural TD is fusion of HIV-1 structural protein p6 to replaced with heterologous domains (TD*) from Rad9p and NpwBP (solid). LexA DNA-binding domain an IN-LexA targeting domain directs IN (open) is expressed fused to Sir4pC, Rad53p FHA1, and Npw38 WW domains (hatched). Integration occurs proximal to LexA-binding sites (open arrows) in plasmid target (closed circle). to the virion and complements catalyti- cally defective IN contributed from Gag- Pol (37, 38). However, there are no nat- ruption of IN. A 13-aa sequence in tion of the majority of (nontarget plas- urally occurring LexA-binding sites in Rad9p mediates its interaction with a mid) Ty5 integrations? Do nonplasmid mammalian cells, and targeting to syn- forkhead-associated domain (FHA1) in insertions default to random, to native thetic sites has not yet been reported. another DNA repair protein, Rad53p. A Rad53p direction in the case of the Ty5 is distinct among the yeast ele- 12-aa domain in NpwBP mediates inter- Rad9p-based TD, or do natural, as yet ments. Originally identified as a degen- action with the WW domain of another unidentified, functions continue to oper- erate element at the ends of Saccharo- nuclear protein Npw38. The Rad9p and ate on the Ty5 IN? Is it possible to gen- myces cerevisiae chromosomes (39), the NpwBP domains were substituted for erate integration that is more highly Voytas laboratory recovered an active the natural Ty5 TD. The partner inter- restricted, perhaps through the use of copy from Saccharomyces paradoxus and acting domains (i.e., FHA1 from phage panning or slightly larger transferred it into S. cerevisiae (3). In Rad53p and WW from Npw38) were domains? this context, they showed that Ty5 in- expressed fused to the LexA DNA-bind- The experiments by Voytas suggest serted into heterochromatic DNA (40). ing domain. Yeast were transformed many new avenues for explora- Mutations in Sir3p or Sir4p that dis- with the synthetic Ty5 TD elements, tion. The occurrence of a compact and rupted silencing of telomeric DNA also constructs from which fusion DNA- independent interaction domain in a resulted in loss of targeting to silenced binding domains were expressed, and a retroviral-type IN of course poses the regions (41). The pieces of the puzzle target plasmid containing LexA-binding question of whether other such domains fell quickly into place. A targeting do- sites embedded in Arabidopsis DNA. exist. In the case of Ty3, interactions main of 6 aa (TD), virtually at the C- Target plasmids were recovered in Esch- between the N-terminal domain and terminus of Ty5 IN, was mapped, which erichia coli for analysis. For Ty5-TD and TFIIIC subunit Tfc1p have been docu- was required for targeting (42) and Ty5-Rad9p targeting, 26 integrant joints mented in vitro and are consistent with which mediated interactions with a large were sequenced and shown to be within in vivo results (44). Ty3 also has a rela- C-terminal portion of Sir4p (43). 120 bp of LexA-binding sites, and of 18 tively extended C-terminal domain that In the current article (2), the Voytas further analyzed, all had the direct could interact with targeting proteins laboratory accomplishes design-based flanking repeats characteristic of bona including TFIIIB subunits, but this has integration. The strategy is outlined fide integrants. In the case of targeting not been demonstrated. It seems likely in Fig. 1. They fused the LexA DNA- to Sir4p-, Rad53p FHA1-, and Npw38 that the S. cerevisiae Pseudoviridae ele- binding domain to one of several TD- WW-LexA fusions and target plasmids ment Ty1 will be targeted by some fea- interacting domains: first the C-terminal with four copies of the LexA operator, ture of chromatin which distinguishes domain of Sir4p (Sir4pC). Next the 6-aa about one-sixth of transposition was into regions directly upstream of tRNA IN TD and the Sir4pC fusion domains the target. genes (27). An alignment of Metaviridae were swapped with two pairs of heterol- Many questions remain. For example, element IN C-terminal domains recently ogous partner domains. Such domains how does Ty5 access the DNA after resulted in the identification of a chro- were carefully chosen to minimize dis- docking at Sir4p? What is the distribu- modomain motif (24). Tf1, a Schizosac-

Sandmeyer PNAS ͉ May 13, 2003 ͉ vol. 100 ͉ no. 10 ͉ 5587 Downloaded by guest on September 29, 2021 charomyces pombe element of this class of characterized retroviral IN proteins to be better tolerated by the virion. A has been shown to insert in inter-ORF has an SH3 structure and the SH3 motif second point is that the known structure spaces, apparently with preference for mediates a wide variety of protein inter- of the C-terminal domain of retroviral the region within 100–300 bp from the actions albeit mostly having to do with IN might be used to identify positions ORF initiation codon (45, 46). Results signal transduction (47). In addition, it actually within the IN, which are com- of recent experiments suggest that Tf1 has been shown that several chromatin- patible with replacements or insertions integration is actually targeted through related proteins enhance retroviral inte- of small TD cassettes. The Ty5 study interaction of the chromodomain with gration in vitro and potentially in vivo; underscores the findings from in vitro histone H3 methylated at K4 (H. Levin, one such case is INI1 (48), and another targeting studies with retroviral IN, National Institutes of Health, Bethesda, is LEDGF͞p75 (49). The recent findings namely that the C-terminal domain can deliver active IN to the integration site. personal communication). These obser- in yeast are likely to encourage further Finally, although protein–protein media- vations are exciting because they not exploration for proteins that contribute tion of IN docking does not have the only hint at the subtlety and diversity of to the loosely defined preference of at reassuring simplicity of an IN that binds integration specificity, but suggest that least some retroviruses for insertion into unique DNA sequences, it offers the integration can be used to learn about transcriptionally active regions and into rich combinatorial complexity of the chromatin structure as well as to manip- particular hotspots. natural proteome. ulate the genome. What are the lessons that could be Clearly, much work remains to ex- It is not clear to what extent retroviral applied to better laboratory retrovirus plore the mechanisms, implications, and proteins will be shown to interact with vectors, or even make safer therapeutic applications of targeted retroviral inte- specific proteins for targeting in the vectors? One observation, so obvious it gration. Integration by design in a manner observed for the yeast LTR ret- can hardly be considered a lesson, is model organism from the Voytas labo- rotransposons. The C-terminal domain that relatively subtle changes are likely ratory hints at the possibilities.

1. Kaiser, J. (2003) Science 299, 991. 19. Katz, R. A., Gravuer, K. & Skalka, A. M. (1998) 35. Katz, R. A., Merkel, G. & Skalka, A. M. (1996) 2. Zhu, Y., Dai, J., Fuerst, P. G. & Voytas, D. F. J. Biol. Chem. 273, 24190–24195. Virology 217, 178–190. (2003) Proc. Natl. Acad. Sci. USA 100, 5891–5895. 20. Pryciak, P. M., Mu¨ller, H.-P. & Varmus, H. E. 36. Bushman, F. D. & Miller, M. D. (1997) J. Virol. 71, 3. Zou, S., Ke, N., Kim, J. M. & Voytas, D. F. (1996) (1992) Proc. Natl. Acad. Sci. USA 89, 9237– 458–464. Genes Dev. 10, 634–645. 9241. 37. Holmes-Son, M. L. & Chow, S. A. (2000) J. Virol. 4. Galimi, F. & Verma, I. M. (2002) Curr. Top. 21. Pryciak, P. M. & Varmus, H. E. (1992) Cell 69, 74, 11548–11556. Microbiol. Immunol. 261, 245–254. 769–780. 38. Holmes-Son, M. L. & Chow, S. A. (2002) Mol. 5. Sandmeyer, S. B., Hansen, L. J. & Chalker, D. L. 22. Pruss, D., Bushman, F. D. & Wolffe, A. P. (1994) Ther. 5, 360–370. (1990) Annu. Rev. Genet. 24, 491–518. Proc. Natl. Acad. Sci. USA 91, 5913–5917. 39. Voytas, D. F. & Boeke, J. D. (1992) Nature 358, 6. Scherdin, U., Rhodes, K. & Breindl, M. (1990) 23. Peterson-Burch, B. D. & Voytas, D. F. (2002) Mol. 717. J. Virol. 64, 907–912. Biol. Evol. 19, 1832–1845. 40. Zou, S. & Voytas, D. F. (1997) Proc. Natl. Acad. 7. Mooslehner, K., Karls, U. & Harbers, K. (1990) 24. Malik, H. S. & Eickbush, T. H. (1999) J. Virol. 73, Sci. USA 94, 7412–7416. J. Virol. 64, 3056–3058. 5186–5190. 41. Zhu, Y., Zou, S., Wright, D. A. & Voytas, D. F. 8. Withers-Ward, E. S., Kitamura, Y., Barnes, J. P. & 25. Ji, H., Moore, D. P., Blomberg, M. A., Braiterman, (1999) Genes Dev. 13, 2738–2749. Coffin, J. M. (1994) Genes Dev. 8, 1473–1487. L. T., Voytas, D. F., Natsoulis, G. & Boeke, J. D. 42. Gai, X. & Voytas, D. F. (1998) Mol. Cell 1, 9. Schroder, A. R., Shinn, P., Chen, H., Berry, C., (1993) Cell 73, 1007–1018. 1051–1055. Ecker, J. R. & Bushman, F. (2002) Cell 110, 26. Kim, J. M., Vanguri, S., Boeke, J. D., Gabriel, A. 43. Xie, W., Gai, X., Zhu, Y., Zappulla, D. C., 521–529. & Voytas, D. F. (1998) Genome Res. 8, 464–478. Sternglanz, R. & Voytas, D. F. (2001) Mol. Cell. 10. Weidhaas, J. B., Angelichio, E. L., Fenner, S. & 27. Devine, S. E. & Boeke, J. D. (1996) Genes Dev. 10, Biol. 21, 6606–6614. Coffin, J. M. (2000) J. Virol. 74, 8382–8389. 620–633. 44. Aye, M., Dildine, S. L., Claypool, J. A., Jourdain, 11. Haren, L., Ton-Hoang, B. & Chandler, M. (1999) 28. Kirchner, J., Connolly, C. M. & Sandmeyer, S. B. Mol. Cell. Biol. 21, Annu. Rev. Microbiol. 53, 245–281. (1995) Science 267, 1488–1491. S. & Sandmeyer, S. B. (2001) 12. Hindmarsh, P. & Leis, J. (1999) Microbiol. Mol. 29. Yieh, L., Kassavetis, G., Geiduschek, E. P. & Sand- 7839–7851. Biol. Rev. 63, 836–843. meyer, S. B. (2000) J. Biol. Chem. 275, 29800– 45. Behrens, R., Hayles, J. & Nurse, P. (2000) Nucleic 13. Wlodawer, A. (1999) Adv. Virus Res. 52, 335–350. 29807. Acids Res. 28, 4709–4716. 14. Craigie, R. (2001) J. Biol. Chem. 276, 23213– 30. Chalker, D. L. & Sandmeyer, S. B. (1992) Genes 46. Singleton, T. L. & Levin, H. L. (2002) Eukaryot. 23216. Dev. 6, 117–128. Cell 1, 44–55. 15. Rice, P. A. & Baker, T. A. (2001) Nat. Struct. Biol. 31. Bushman, F. D. (2002) Curr. Top. Microbiol. Im- 47. Mayer, B. J. (2001) J. Cell Sci. 114, 1253–1263. 8, 302–307. munol. 261, 165–177. 48. Kalpana, G. V., Marmon, S., Wang, W., 16. Fujiwara, T. & Mizuuchi, K. (1988) Cell 54, 497– 32. Holmes-Son, M. L., Appa, R. S. & Chow, S. A. Crabtree, G. R. & Goff, S. P. (1994) Science 266, 504. (2001) Adv. Genet. 43, 33–69. 2002–2006. 17. Craigie, R. & Mizuuchi, K. (1985) Cell 41, 867– 33. Bushman, F. (1994) Proc. Natl. Acad. Sci. USA 91, 49. Cherepanov, P., Maertens, G., Proost, P., 876. 9233–9237. Devreese, B., Van Beeumen, J., Engelborghs, Y., 18. Mu¨ller, H.-P. & Varmus, H. E. (1994) EMBO J. 13, 34. Goulaouic, H. & Chow, S. A. (1996) J. Virol. 70, De Clercq, E. & Debyser, Z. (2003) J. Biol. Chem. 4704–4714. 37–46. 278, 372–381.

5588 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.1031802100 Sandmeyer Downloaded by guest on September 29, 2021