1 Genomic DNA Transposition Induced by Human PGBD5 Anton G
Total Page:16
File Type:pdf, Size:1020Kb
1 Genomic DNA transposition induced by human PGBD5 2 3 Anton G. Henssena, Elizabeth Henaffb, Eileen Jianga, Amy R. Eisenberga, Julianne R. Carsona, 4 Camila M. Villasantea, Mondira Raya, Eric Stilla, Melissa Burnsc, Jorge Gandarab, Cedric 5 Feschotted, Christopher E. Masonb, Alex Kentsis a,e,f,1 6 7 8 9 a Molecular Pharmacology Program, Sloan Kettering Institute, New York, NY; 10 b Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY; 11 c Boston Children’s Hospital, Boston, MA; 12 d Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT; 13 e Department of Pediatrics, Memorial Sloan Kettering Cancer Center, New York, NY; 14 f Weill Cornell Medical College, Cornell University, New York, NY. 15 16 17 18 19 1 To whom correspondence may be addressed: Email: [email protected] 20 21 Author contributions: A.H. and A.K. designed the research, A.H., E.H., A.E., E.J., M.B., J.R.C, 22 C.V. performed the research, A.H., E.H., C.F., C.E.M., A.K. analyzed data, A.H. and A.K. wrote 23 the manuscript in consultation with all authors. 24 25 The authors declare no conflict of interest. 26 27 Data deposition: The data reported in this manuscript have been deposited to the Sequence Read 28 Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra/, accession number SRP061649), the Dryad 29 Digital Repository (http://datadryad.org, doi:10.5061/dryad.b2hc1), and the Zenodo computer 30 archive (http://dx.doi.org/10.5281/zenodo.22206). Detailed experimental protocols have been 31 posted at the Nature Protocol Exchange 32 (http://www.nature.com/protocolexchange/labgroups/38333). 33 34 This article contains supporting information. 35 1 36 Abstract 37 Transposons are mobile genetic elements that are found in nearly all organisms, including 38 humans. Mobilization of DNA transposons by transposase enzymes can cause genomic 39 rearrangements, but our knowledge of human genes derived from transposases is limited. Here, 40 we find that the protein encoded by human PGBD5, the most evolutionarily conserved 41 transposable element-derived gene in vertebrates, can induce stereotypical cut-and-paste DNA 42 transposition in human cells. Genomic integration activity of PGBD5 requires distinct aspartic 43 acid residues in its transposase domain, and specific DNA sequences containing inverted 44 terminal repeats with similarity to piggyBac transposons. DNA transposition catalyzed by 45 PGBD5 in human cells occurs genome-wide, with precise transposon excision and preference for 46 insertion at TTAA sites. The apparent conservation of DNA transposition activity by PGBD5 47 suggests that genomic remodeling contributes to its biological function. 2 48 Introduction 49 Transposons are genetic elements that are found in nearly all living organisms (1). They 50 can contribute to the developmental and adaptive regulation of gene expression and are a major 51 source of genetic variation that drives genome evolution (2). In humans and other mammals, they 52 comprise about half of the nuclear genome (3). The majority of primate-specific sequences that 53 regulate gene expression are derived from transposons (4), and transposons are a major source of 54 structural genetic variation in human populations (5). 55 While the majority of genes that encode transposase enzymes tend to become 56 catalytically inactive and their transposon substrates tend to become immobile in the course of 57 organismal evolution, some can maintain their transposition activities (6, 7). In humans, at least 58 one hundred L1 long interspersed repeated sequences (LINEs) actively transpose in human 59 genomes and induce structural variation (8), including somatic rearrangements in neurons that 60 may contribute to neuronal plasticity (9). The human Transib-like transposase RAG1 catalyzes 61 somatic recombination of the V(D)J receptor genes in lymphocytes, and is essential for adaptive 62 immunity (10). The Mariner-derived transposase SETMAR functions in single-stranded DNA 63 resection during DNA repair and replication in human cells and can catalyze DNA transposition 64 in vitro (7, 11). 65 Among transposase enzymes that can catalyze excision and insertion of transposon 66 sequences, DNA transposases are distinct in their dependence only on the availability of 67 competent genomic substrates and cellular repair enzymes that ligate and repair excision sites, as 68 compared to retrotransposons which require transcription of the mobilized sequences (12). Most 69 DNA transposases utilize an RNase H-like domain with three aspartate or glutamate residues 70 (so-called DDD or DDE motif) that catalyze magnesium-dependent hydrolysis of phosphodiester 3 71 bonds and strand exchange (13-15). The IS4 transposase family, which includes piggyBac 72 transposases, is additionally distinguished by precise excisions without modifications of the 73 transposon flanking sequences (16). The piggyBac transposase and its transposon were originally 74 identified as an insertion in lepidopteran Trichoplusia ni cells (17). The piggyBac transposon 75 consists of 13-bp inverted terminal repeats (ITR) and 19-bp subterminal inverted repeats located 76 3 and 31 base pairs from the 5’ and 3’ ITRs, respectively (18). PiggyBac transposase can 77 mobilize a variety of ITR-flanked sequences and has a preference for integration at TTAA target 78 sites in the host genome (15, 18-23). 79 Members of the piggyBac superfamily of transposons have colonized a wide range of 80 organisms (24), including a recent and likely ongoing invasion of the bat M. lucifugus (25). The 81 human genome contains 5 paralogous genes derived from piggyBac transposases, PGBD1-5 (24, 82 26). PGBD1 and PGBD2 invaded the common mammalian ancestor, and PGBD3 and PGBD4 83 are restricted to primates, but are all contained as single coding exons, fused in frame with 84 endogenous host genes, such as the Cockayne syndrome B gene (CSB)-PGBD3 fusion (24, 27). 85 Thus far, only the function of PGBD3 has been investigated. CSB-PGBD3 is capable of binding 86 DNA, including endogenous piggyBac-like transposons in the human genome, but has no known 87 catalytic activity, though biochemical and genetic evidence indicates that it may participate in 88 DNA damage response (28, 29). 89 PGBD5 is distinct from other human piggyBac-derived genes by having been 90 domesticated much earlier in vertebrate evolution approximately 500 million years (My) ago, in 91 the common ancestor of cephalochordates and vertebrates (24, 30). PGBD5 is transcribed as a 92 multi-intronic but non-chimeric transcript predicted to encode a full-length transposase (30). 93 Furthermore, PGBD5 expression in both human and mouse appears largely restricted to the early 4 94 embryo and certain areas of the embryonic and adult brain (24, 30). These intriguing features 95 prompted us to investigate whether human PGBD5 has retained the enzymatic capability of 96 mobilizing DNA. 5 97 Results 98 Human PGBD5 contains a C-terminal RNase H-like domain that has approximately 20% 99 sequence identity and 45% similarity to the active lepidopteran piggyBac, ciliate piggyMac, and 100 mammalian piggyBat transposases (Figure 1) (24, 25, 31). In addition, the human genome 101 contains 2,358 sequence elements with similarity to the piggyBac transposable elements (Table 1 102 and Figure 2A). Specifically, MER75 (MER75, MER75A, MER75B) and MER85 elements 103 show considerably similar inverted terminal repeat sequences as compared to lepidopteran 104 piggyBac transposons (Table 2 and Figure 2B). A total of 328 piggyBac-like elements in the 105 human genome have intact inverted terminal repeats and exhibit duplications of their presumed 106 TTAA target sites (Table 1 and Figure 2C). We reasoned that even though the ancestral 107 transposon substrates of PGBD5 cannot be predicted due to its very ancient evolutionary origin 108 (~500 My), preservation of its transposase activity should confer residual ability to mobilize 109 distantly related piggyBac-like transposons. To test this hypothesis, we used a synthetic 110 transposon reporter PB-EF1-NEO comprised of a neomycin resistance gene flanked by T. ni 111 piggyBac ITRs (Figure 3B) (20, 32). We transiently transfected human embryonic kidney (HEK) 112 293 cells, which lack endogenous PGBD5 expression with the PB-EF1-NEO transposon reporter 113 plasmid in the presence of a plasmid expressing PGBD5, and assessed genomic integration of the 114 reporter using clonogenic assays in the presence of G418 to select cells with genomic integration 115 conferring neomycin resistance (Figure 3C, Figure 3 – figure supplement 1). Given the absence 116 of suitable antibodies to monitor PGBD5 expression, we expressed PGBD5 as an N-terminal 117 fusion with the green fluorescent protein (GFP). We observed significant rates of neomycin 118 resistance of cells conferred by the transposon reporter with GFP-PGBD5, but not in cells 119 expressing control GFP or mutant GFP-PGBD5 lacking the transposase domain (Figure 3C), 120 despite equal expression of all transgenes (Figure 3 – figure supplement 2). The efficiency of 6 121 neomycin resistance conferred by the transposon reporter with GFP-PGBD5 was approximately 122 4.5-fold less than that of the T. ni piggyBac-derived transposase (Figure 3D), consistent with 123 their evolutionary divergence. These results suggest that human PGBD5 can promote genomic 124 integration of a piggyBac-like transposon. 125 If neomycin resistance conferred by the PGBD5 and the transposon reporter is due to 126 genomic integration and DNA transposition, then this should require specific activity on the 127 transposon ITRs. To test this hypothesis, we generated transposon reporters with mutant ITRs 128 and assayed them for genomic integration (Figure 3B, Figure 3 – figure supplement 3). DNA 129 transposition by the piggyBac family transposases involves hairpin intermediates with a 130 conserved 5’-GGGTTAACCC-3’ sequence that is required for target site phosphodiester 131 hydrolysis (15). Thus, we generated reporter plasmids lacking ITRs entirely or containing 132 complete ITRs with 5’-ATATTAACCC-3’ mutations predicted to disrupt the formation of 133 productive hairpin intermediates (15).