<<

Downloaded from .cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Letter Multiple waves of recent DNA transposon activity in the , Myotis lucifugus

David A. Ray,1,5,6 Cedric Feschotte,2,5 Heidi J.T. Pagan,1 Jeremy D. Smith,1 Ellen J. Pritham,2 Peter Arensburger,3 Peter W. Atkinson,3 and Nancy L. Craig4 1Department of Biology, West Virginia University, Morgantown, West Virginia 26506, USA; 2Department of Biology, University of Texas at Arlington, Arlington, Texas 76019, USA; 3Department of Entomology and Institute for Integrative Genome Biology, University of California, Riverside, California 92521, USA; 4Howard Hughes Medical Institute, Department of Molecular Biology & , Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA

DNA transposons, or class 2 transposable elements, have successfully propagated in a wide variety of . However, it is widely believed that DNA transposon activity has ceased in mammalian genomes for at least the last 40 million years. We recently reported evidence for the relatively recent activity of hAT and elements, two distinct groups of DNA transposons, in the lineage of the vespertilionid bat Myotis lucifugus. Here, we describe seven additional families that have also been recently active in the bat lineage. Early vespertilionid genome evolution was dominated by the activity of Helitrons, mariner-like and Tc2-like elements. This was followed by the colonization of Tc1-like elements, and by a more recent explosion of hAT-like elements. Finally, and most recently, piggyBac-like elements have amplified within the Myotis genome and our results indicate that one of these families is probably still expanding in natural populations. Together, these data suggest that there has been tremendous recent activity of various DNA transposons in the bat lineage that far exceeds those previously reported for any mammalian lineage. The diverse and recent populations of DNA transposons in genus Myotis will provide an unprecedented opportunity to study the impact of this class of elements on mammalian genome evolution and to better understand what makes some species more susceptible to invasion by genomic parasites than others. [Supplemental material is available online at www.genome.org.]

Transposable elements represent a substantial fraction of many man DNA transposon families younger than ∼37 Myr (Pace and eukaryotic genomes. For example, ∼50% of the genome is Feschotte 2007). derived from sequences (Lander et al. In stark contrast to the situation in human and mouse, there 2001). Other genomes, especially those of plants, may consist of is mounting evidence for recent and substantial DNA transposon substantially higher proportions of transposable element-derived activity in the vespertilionid bat Myotis lucifugus. In one study, DNA (SanMiguel and Bennetzen 1998; Bennetzen 2000). Trans- this conclusion was inferred from the presence of minimally di- posable elements are typically divided into two classes (Wicker et verged nonautonomous hAT transposons, the polymorphic sta- al. 2007). Class 1 is represented by the (LINEs, tus of some of these transposons in natural Myotis populations, SINEs, LTRs, and ERVs). Class 2 includes the “cut-and-paste” and the discovery of an apparently full-length and potentially DNA transposons, which are characterized by terminal inverted functional autonomous hAT transposon (Ray et al. 2007). In a repeats (TIRs) and are mobilized by an element-encoded trans- separate report, an extensive and recently amplified population posase. Currently, 10 superfamilies of cut-and-paste DNA trans- of Helitrons, a distinct subclass of DNA transposons, was discov- posons are recognized in (Feschotte and Pritham ered in the M. lucifugus genome (Pritham and Feschotte 2007). 2007). These surprising observations raised a number of intriguing ques- While class 2 elements are widespread and active in a variety tions about the activity of DNA transposons and its genomic of eukaryotes, they have been thought to be transpositionally impact in Myotis and other . In particular, we wondered inactive in mammalian genomes. This conclusion was based pri- whether the discovery of relatively recent hAT and Helitron ac- marily on the initial analyses of the human and mouse genome tivity in Myotis were merely coincidental or whether it reflected sequences. While both species harbor a significant number and a some unique biological or genomic features of bats in general or diverse assortment of DNA transposons, they show no signs of vespertilionids in particular that has allowed DNA transposons to recent activity (Lander et al. 2001; Waterston et al. 2002). For invade and expand in these genomes, while such occurrences example, there are more than 300,000 DNA elements recogniz- appear to be rare or nonexistent in other mammalian genomes. able in the , which are grouped into 120 families As a prelude to addressing these questions we examined M. lu- and belong to five superfamilies. A large subset of these elements cifugus for additional traces of DNA transposon activity. Conse- (40 families; ∼98,000 copies) were integrated in the last 40–80 quently, we explored the diversity, abundance, and evolutionary million years (Myr), but there remains no evidence for any hu- history of DNA transposons in the M. lucifugus genome. Herein, we characterize seven new families of DNA trans- 5These authors contributed equally to this work. posons from the current whole-genome sequence of M. lucifugus. 6 Corresponding author. In addition, we provide additional information on the previously E-mail [email protected]; fax (304) 293-6363. Article published online before print. Article and publication date are at http:// described Myotis_hAT transposons (Ray et al. 2007). Together, www.genome.org/cgi/doi/10.1101/gr.071886.107. these elements represent three distinct superfamilies and exhibit

18:000–000 ©2008 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/08; www.genome.org Genome Research 1 www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Ray et al. clear signs of recent activity in the vespertilionid lineage, that is, Tigger-like elements called Tigger1_ML. A tentative consensus was within the last 40–50 Myr. We also identified a ninth transposon reconstructed from the alignment of the seven longest copies. lineage that likely expanded prior to the chiropteran divergence. The size (2.8 kb) and TIRs are consistent with other Tigger ele- One family represents the youngest DNA transposon family so ments and the consensus is similar to consensus sequences in far recorded in any mammalian species, one that is likely still Repbase as Tigger1a_CAR and Tigger1a_ART, which represent Tig- expanding in the genome. The discovery of this unprecedented ger1-like families specific to carnivores and artiodactyls, respec- level of DNA transposon activity in a mammalian genome rep- tively. We also identified several highly eroded copies of a related resents a dramatic shift in our view of mobile element biology in family of Tigger1 transposons in the European hedgehog, Erina- . ceus europaeus (e.g., accession AANN01830819, position 1599– 4234). Given that hedgehog, carnivores, artiodactyls, and chirop- terans are all part of the well-supported super- Results order, we hypothesize that Tigger1_ML amplified during the early part of the Laurasiatherian radiation. This is by far the oldest hAT superfamily family of DNA transposons that we identified and, as it does not The consensus sequence for Myotis_hAT1 was previously de- appear to be unique to chiropterans, we have chosen not to focus scribed (Ray et al. 2007). To summarize, the entire sequence on the details of this family in the current study. spans 2921 nucleotides with a single ORF consisting of bases The Tc2 group is sister to the pogo lineage and was first 700–2631 that encodes an apparently intact transposase of 643 identified in Caenorhabditis elegans (Ruvolo et al. 1992). There are amino acids (aa). The present analysis confirms the characteristic relatively few Tc2-like elements in human, and their amplifica- target-site duplications (TSDs) for Myotis hAT elements, 8 bp with tion predates the split of eutherian mammals (Pace and Feschotte a central TA dinucleotide, and the typical short TIR (terminal 2007). In contrast, we identified a distinct and much more recent inverted repeats) sequence. Myotis_hAT1 and its nonautonomous family of Tc2-like elements in M. lucifugus. The Tc2_1_ML con- derivatives are by far the most abundant family of elements ana- sensus is 1728 bp with 23 bp TIRs similar to known or newly lyzed in this study, with >96,000 hits of Ն100 bp in the current identified Tc2 elements. It contains a single ORF spanning posi- WGS data. tions 213–1559 and encoding a putative transposase of 448 aa. We identified a second family of hAT elements, called Phylogenetic analysis confirms that Tc2_1_ML belongs to the Tc2 hAT2_ML. The TIRs of hAT2_ML are highly similar to Myotis_ group, but does not cluster with either of the mammalian Tc2- hAT1, but the internal sequence is only weakly similar. hAT2_ML like transposases identified in human (Kanga), opposum is predicted to encode a 428-aa transposase with only minimal (Tc2_MD2), and tenrec (Tc2_Et), or with the human transposase- sequence similarity and 33% amino acid identity to the Myotis_ derived proteins POGK and POGZ (Fig. 1B). Thus, Tc2_1_ML de- hAT1 transposase. Phylogenetic comparison to other hAT trans- fines a distinct lineage of Tc2-like elements. Tc2_1_ML is the least posases confirms that Myotis_hAT1 and hAT2_ML belong to the abundant of the DNA transposon families described in this study Ն hAT superfamily and are closely related to hAT1_MD (Repbase with only 589 hits 100 bp. Update, volume 12, issue 10), a hAT transposon family recently Two distinct lineages of canonical D34D mariner are known identified in the opossum Monodelphis domestica (Fig. 1A). Re- to occur in mammals: the cecropia subfamily, represented by peatMasker results using the full consensus as a query suggests Hsmar1 in anthropoid primates and Mmar1 in mouse, and the that hAT2_ML is not nearly as widespread as Myotis_ irritans subfamily, represented by Hsmar2 in primates (Robertson hAT1 with only 2004 hits of Ն100 bp. 2002). We detected an abundant (2036 hits >100 bp) family of We also recovered a third hAT family member, hAT3_ML, transposons in M. lucifugus, Mlmar1, that displays all the features from the M. lucifugus genome. The consensus sequence contains of a canonical D34D mariner. The Mlmar1 consensus is 1287 bp a large ORF spanning base pairs 1595–3403 and encoding a pu- with 30-bp TIRs and a single ORF, spanning position 174–1184 tative protein of 602 aa, which nests in phylogenetic analysis and predicted to encode a 336-aa transposase. Surprisingly, phy- within a clade encompassing several mammalian hAT trans- logenetic analyses robustly place the predicted Mlmar1 trans- posases (Fig. 1A). With its 3904-bp consensus, it is the longest of posase within the mauritiana subfamily of mariner (Fig. 1C), a all of the elements described in this study and is second to Myo- subfamily typified by the original mos1 element active in Dro- tis_hAT1 in terms of its abundance with >12,000 elements of sophila mauritiana (Hartl et al. 1997; Robertson 2002). We could Ն100 bp identified in the current WGS assembly of M. lucifugus. not detect any members of the irritans or cecropia subfamilies of mariner in M. lucifugus using protein queries representing these lineages, although these two lineages have been identified in a Tc1/mariner superfamily broad range of mammals (Demattei et al. 2000; Robertson 2002; Eukaryotic Tc1/mariner elements are divided into several an- Waterston et al. 2002; Sinzelle et al. 2006), including other bat ciently diverged lineages (Robertson 2002; Feschotte and Pritham species (see below). 2007). Three distinct Tc1/mariner lineages have been previously Tc1-like elements represent another distinct lineage of Tc1/ identified in mammals and characterized most extensively in hu- mariner elements that are widespread and common in inverte- man: pogo, Tc2, and D34D mariner. We identified members of all brates and lower vertebrates (fish, frogs), but no Tc1-like element three groups in M. lucifugus as well as the first mammalian mem- has previously been identified in mammals (Avancini et al. 1996; ber of the Tc1 group. The pogo lineage is represented by Tigger Leaver 2001; Sinzelle et al. 2005). Consistent with this observa- elements in mammals. It is known to include both eutherian- tion, TBLASTN searches with representative Tc1-like transposases wide and primate-specific families (Robertson 1996; Smit and from fish or nematode return no significant hits in the complete Riggs 1996; Pace and Feschotte 2007). Thus, the presence of Tig- genome of human, mouse, rat, and dog. In contrast, the same ger elements was somewhat expected in M. lucifugus. Indeed, we query returns hundreds of hits from the M. lucifugus WGS data- identified remnants of what appears to be an ancient family of base. A single Tc1-like family, Tc1_1_ML, predominates in the

2 Genome Research www.genome.org Downloaded from genome.cshlp.org onOctober5,2021-Publishedby ae fDAtasoo ciiyi bats in activity transposon DNA of Waves Cold SpringHarborLaboratoryPress eoeRsac 3 Research Genome www.genome.org Figure 1. Results of phylogenetic analyses. (A) hAT transposases, (B) Tc2 transposases, (C) mariner transposases, (D) Tc1 transposases, (E) piggyBac transposases.

Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Ray et al. genome. The Tc1_1_ML consensus spans 1222 bp with 29 bp TIR poson families are associated with multiple nonautonomous with a 5Ј-CACTG-3Ј terminal motif typical of many Tc1-like el- MITE subfamilies. It is beyond the scope of the present study to ements. The ORF occupies position 150–1190 of the consensus describe all of these subfamilies in detail, but we note that non- and encodes a putative transposase of 346 residues. Phylogenetic autonomous elements appear to have outnumbered their au- analyses show that the Tc1_1_ML transposase forms a well- tonomous partners, as is typically observed for DNA transposon supported clade with Tc1-like transposases from fish (Fig. 1D). families in and most other eukaryotes. (Feschotte et al. We detected 3788 Tc1_1_ML fragments of 100 bp or larger in the 2002; Pace and Feschotte 2007) M. lucifugus WGS database. Many of the Tc1_1_ML elements are inserted in (TA) dinucleotide repeats, an insertion preference also Taxonomic distribution of DNA transposons identified observed for other Tc1 elements (Vigdal et al. 2002). Of note is in Myotis lucifugus one subfamily derived from Tc1_1_ML. The consensus of this subfamily has an intact ORF that represents 2/3 (708 bp; 235 aa) PCR-based analyses of 15 taxa representing nine bat families were of the complete ORF for Tc1_1_ML. Five hundred copies of this performed using two oligonucleotide primer pairs for each trans- subfamily of Tc1 are distributed throughout the M. lucifugus ge- poson family (Fig. 2). Results indicate that the families described nome. piggyBac superfamily piggyBac-like elements have been identi- fied in a wide range of species and Entamoeba (Sarkar et al. 2003; Pritham et al. 2005). Among mammals, piggyBac elements have so far only been characterized in the human genome, where they are predominantly repre- sented by two families of nonautono- mous elements (MER85 and MER75) and by several stationary “domesticated” transposases (PGBD1-5 genes) (Sarkar et al. 2003). We identified two families of piggyBac-like elements in M. lucifugus. piggyBac1_ML is defined by a 2626-bp consensus with short TIRs (15 bp) that are very similar to other piggyBac trans- posons. The consensus contains a 1719- bp ORF (position 587–2305) that likely encodes a transposase of 572 residues. Two potentially active elements, with intact ORFs and identifiable TSDs, were located in the available genome data. piggyBac2_ML is similar in length at 2639 bp and with a 583-aa transposase (nt 716–2467). Sequence comparison with other piggyBac transposases and phylo- genetic analyses indicate that the two families are relatively closely related and form a strong cluster with the sea squirt, Ciona intestinalis (Fig. 1E). Myotis piggy- Bac-like elements tend to be flanked by the canonical 5Ј-TTAA-3Ј TSD. piggyBac1_ ML elements number at least 2056 in the M. lucifugus genomes and are likely still mobilizing (see below). However, piggyBac2_ML elements are currently more numerous, with at least 3869 in- stances. In summary, we identified a diverse array of hAT, Tc1/mariner, and piggyBac transposons in the M. lucifugus genome. The Myotis_hAT1 and hAT3_ML ele- ments are by far the most abundant. As with Myotis_hAT1 elements (Ray et al. Figure 2. Electropherogram results from PCR analysis of a taxonomically diverse panel of bat ge- 2007), all of the newly discovered trans- nomes. Oligonucleotide primer pairs are indicated in white.

4 Genome Research www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Waves of DNA transposon activity in bats have limited taxonomic distributions. Tc1_1_ML, Tc2_1_ML, and mar1,amauritiana-like mariner from the ant Messor bouvieri (see Mlmar1 appear to be limited to with the possible also phylogeny in Fig. 1C). The transposase regions are 75% iden- exception of Tc1-like sequences in parnelli (Family tical at the nucleotide level and 85% similar at the protein level. ). All three hAT families are restricted to the genus This level of sequence similarity is remarkable given the evolu- Myotis. piggyBac2_ML shares a similar distribution to the hAT el- tionary distance of their host species, >800 Myr (Gu 1998; Blair ements, but positive results were obtained from the lone repre- and Hedges 2005). We could not detect any traces of other mau- sentative of Miniopteridae, a family that was recently elevated ritiana-like mariner transposases in any of the ∼30 additional from subfamily status within Vesperdilionidae (Miller- mammalian species nor any other species represented Butterworth et al. 2007). Amplification of the miniopterid and in the NCBI databases. mormoopid representatives suggests the need for additional ex- ploration of these taxa. PCR data using piggyBac1_ML-derived Age estimations of DNA transposon families oligonucleotides suggests that these elements are even further restricted. Amplification was obtained in Myotis austroriparius,a Table 1 and Figure 3 show the estimated ages for each family fellow North American bat, but not the Asian representative, based on average divergence from the consensus sequence and -see Meth) 9מ10 ן Myotis horsfieldii. We acknowledge that these results do not con- an estimated neutral mutation rate of ∼2.366 firm that no class 2 transposons are active in other bat lineages. ods). Age analyses of the eight youngest families produce a clear Instead, they lend support to the hypothesis that these particular temporal pattern indicative of successive expansions in the ge- transposons have a limited taxonomic distribution. nome. The earliest invasions began with Tc2-like and mariner-like With the exception of Tigger1_ML, whose amplification elements and were followed by the colonization of Tc1_1_ML, likely predates the split of Chiroptera or perhaps even Laurasia- hAT-like, and piggyBac-like families. theria, TBLASTN and BLASTN queries of chiropteran taxa with In mammals, CpG dinucleotides within repeats are gener- the consensus transposon sequences from M. lucifugus yielded ally heavily methylated, and thus tend to degrade into TpG or only very few significant hits in other bat species. A handful of CpA dinucleotides at a substantially higher rate than other base hits were obtained with the Mlmar1 predicted transposase in the combinations (Coulondre et al. 1978; Razin and Riggs 1980; Xing phyllostomid bats Artibeus jamaicensis and Carollia perspicillata, et al. 2004). Thus, lower divergence estimates were expected and and in the rhinolophid bat, Rhinolophus ferrumenquinum. Inter- observed when calculated using sequences from which the CpG estingly, the hits correspond to cecropia-like and the previously dinucleotides had been removed. CpG to non-CpG mutations described irritans-like mariners (Sinzelle et al. 2006) (Fig. 1C). densities were calculated for all families (Table 2). The youngest Thus, it appears that distinct types of mariners have colonized subfamilies, derived from piggyBac1_ML, produced consistently different bat lineages. Mlmar1 is most closely related to Mbou- lower ratios than any other subfamily. This is most likely ex-

Table 1. Divergence values and estimated activity periods for the DNA transposon groups described CpG Sites Included CpG Sites Removed

Age Age Element Mean Substitution Std. estimateb Mean Std. estimateb family Na distance model error Range (Myr) distance Model error Range (Myr) npiggy1_156 181 0.003 K80 0.0004 0.0000–0.0193 1.1–1.4 0.0021 K80 0.0003 0.0000–0.0193 0.76–1.0 piggyBac1_ORF 15 0.0046 K81uf 0.001 0.0012–0.0145 1.5–2.4 0.0038 F81 0.0011 0.0006–0.0148 1.1–2.1 npiggy1_239 153 0.0055 HKY+G 0.0004 0.0000–0.0311 2.2–2.5 0.0029 F81+G 0.0004 0.0000–0.0220 1.1–1.4 nhAT3_200 200 0.0155 HKY+G 0.0007 0.0000–0.0579 6.3–6.9 0.0079 HKY+G 0.0006 0.0000–0.0436 3.1–3.6 piggyBac2 5 0.0159 HKY 0.0013 0.0121–0.0197 6.2–7.3 0.0100 HKY 0.0008 0.0074–0.0119 3.9–4.6 hAT3_ORF 13 0.0177 HKY+G 0.0013 0.0074–0.0233 6.9–8.0 0.0084 HKY+G 0.0012 0.0040–0.0184 3.0–4.1 nhAT2_730 460 0.0207 TVM+G 0.0003 0.0073–0.0484 8.6–8.9 0.0142 TVM+G 0.0003 0.0014–0.0449 5.9–6.1 hAT1_ORF 42 0.0209 GTR+G 0.0008 0.0132–0.0393 8.5–9.2 0.0149 TVM+G 0.0007 0.0087–0.0280 6.0–6.6 npiggy2_345 92 0.0211 TVM+G 0.001 0.0000–0.0599 8.5–9.3 0.0125 K81uf 0.0008 0.0000–0.0367 5.0–5.6 hAT2_ORF 7 0.0213 HKY+G 0.0019 0.0155–0.0290 8.2–9.8 0.0154 HKY 0.0013 0.0110–0.0205 6.0–7.1 nhAT2_1124 47 0.0235 TVM+G 0.0015 0.0111–0.0717 9.3–10.6 0.0161 TVM+G 0.0012 0.0055–0.0516 6.3–7.3 nhAT2_525 138 0.0246 TVM+G 0.0009 0.0069–0.0520 10.0–10.8 0.0164 K81uf+G 0.0007 0.0020–0.0430 6.6–7.2 nhAT1_186 32 0.0296 K81uf+G 0.0025 0.0096–0.0707 11.5–13.6 0.0242 K81uf+G 0.0026 0.0049–0.0689 9.1–11.3 nhAT1_239a 45 0.0304 K81uf+G 0.0025 0.0086–0.0706 11.8–13.9 0.0215 K81uf 0.0022 0.0000–0.0591 8.2–10.0 nTc1_452 36 0.0346 HKY+G 0.0019 0.0147–0.0744 13.8–15.4 0.0201 HKY 0.0013 0.0077–0.0481 8.0–9.0 nTc1_950 41 0.0365 HKY+G 0.0015 0.0218–0.0593 14.8–16.1 0.0225 HKY 0.0011 0.0122–0.0354 9.0–10.0 nMar_1285 69 0.0449 TVM+G 0.001 0.0296–0.0695 18.6–19.4 0.0283 TVM+G 0.0008 0.0136–0.0454 11.6–12.3 nMar_1265 37 0.0473 TVM+G 0.0017 0.0310–0.0716 19.3–20.7 0.0300 TVM+G 0.0013 0.0181–0.0503 12.1–13.2 Tc2_ORF 6 0.0495 TrN+G 0.0017 0.0452–0.0543 20.2–21.6 0.0397 TrN 0.0017 0.0355–0.0462 16.1–17.5 Mar_ORF 113 0.0566 GTR+G 0.0011 0.0330–0.1046 23.5–24.4 0.0324 GTR+G 0.0008 0.0186–0.0751 13.4–14.0 nTc2_555 53 0.0694 TVM+G 0.0015 0.0464–0.0993 28.7–30.0 0.0425 TVM+G 0.0013 0.0219–0.0638 17.4–18.5 nTc2_527 25 0.0773 GTR+G 0.0026 0.0504–0.1005 31.6–33.8 0.0422 TIM 0.0022 0.0249–0.0672 16.9–18.8 Helibat ———— — 30–36c ——— — —

Table is arranged in increasing divergence estimates. “n” in the name indicates internally deleted, nonautonomous variants of the presumed autono- mous element. aSample size for distance calculation. .9מ10 ן bBased on estimated mutation rate of 2.366 cFrom Pritham and Feschotte (2007).

Genome Research 5 www.genome.org eoeResearch Genome 6 al. et Ray www.genome.org Downloaded from genome.cshlp.org onOctober5,2021-Publishedby Cold SpringHarborLaboratoryPress

Figure 3. Divergence estimates and age ranges for each autonomous and nonautonomous transposon family described.

Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Waves of DNA transposon activity in bats

Table 2. CpG vs. nonCpG mutation ratios for each group of duced activity and extinction. In contrast, the npiggy_156 sub- transposons described herein family appears to be nearing its amplification peak, suggesting Element CpG:nonCpG mutation ratio that piggyBac1_ML may still be active in the Myotis genome. The identification of two full-length elements with intact ORFs in the piggyBac1_ML 4.48 current genome sequence data further support this hypothesis. npiggy1_156 2.57 Further evidence for recent piggyBac1_ML activity is in the obser- npiggy1_239 1.56 vation that we were able to experimentally isolate intact prein- piggyBac2_ML 9.40 npiggy2_345 7.76 tegration sites for npiggy_156 elements in a related species, M. Myotis_hAT1 10.38 austroriparius (Fig. 4). Finally, experimental results from a diverse nhAT1_186 4.91 panel of bat species indicate that piggyBac1_ML is present in only nhAT1_239a 4.55 a subset of species in the genus (Fig. 2). Thus, we can confidently hAT2_ML 8.83 nhAT2_730 10.21 state that this particular subfamily was mobilized within the last nhAT2_1124 11.42 ∼8–12 Myr (Stadelmann et al. 2007). nhAT2_525 8.53 hAT3_ML 8.46 nhAT3_200 10.33 Nested insertion analysis nTc1_452 10.06 nTc1_950 9.06 As a second, independent method to estimate relative ages of the Mar_ML 8.16 elements, we compared the number and identity of nested inser- nMar_1265 8.33 tions within individual transposon copies. Assuming that DNA nMar_1285 8.24 transposon insertions occur fairly randomly throughout the ge- Tc2_ML 8.46 nTc2_527 7.74 nome, older elements should have higher proportions of nested nTc2_555 7.92 insertions than younger elements. Furthermore, while it is pos- Mean 7.79 sible for representatives from older transposon families to be- come nested within younger families if their periods of activity The mean value was calculated using all estimations except those of overlapped, these instances should be relatively rare. RepeatMas- piggyBac1_ML derived elements. See text for details. “n” in the name ker hits for each family of transposons were subjected to a second indicates internally deleted, nonautonomous variants of the presumed autonomous element. RepeatMasker analysis using a custom library that included each of the new transposable elements described here and Ves, a SINE found in the suborder (Borodulina and Kra- plained by their very recent deposition in the genome (see below) merov 1999, 2005). We limited ourselves to these elements be- and bcause they have not resided long enough to accumulate cause their sequences have been well characterized here or else- many mutations of any kind. This hypothesis, however, does not where (Borodulina and Kramerov 1999; Ray et al. 2007) and their explain the relatively low ratios observed for the nhAT1 subfami- target site duplications are easily identified. lies, which we believe were mobilized as much as 10 Myr earlier. Analysis of nested insertions supports the age hierarchy sug- Average CpG dinucleotide mutation densities were approxi- gested by divergence estimates. Among the hAT and piggyBac mately eight times higher than non-CpG sites. We note that this elements examined, only two nested DNA transposon insertions ן ∽ is higher than the rate observed for primates ( 6 ; Xing et al. were recovered—two piggyBac1 insertions inside a Myotis_hAT1 2004), bringing to mind the hypothesis that increased methyla- element (Table 3)—lending support to the hypothesis that these tion, and consequently, high CpG:non-CpG mutation ratios may families are the most recently active DNA transposon families in serve as a genomic defense mechanism against large bursts of the M. lucifugus genome. In contrast, the older elements repre- transposable element activity (Schmid 1998; Xing et al. 2004). sented by the Tc1, Tc2, and mariner families have been subjected to numerous interruptions by the more recently active elements. Current activity by piggyBac1_ML Mlmar1 elements have suffered nested insertions from all other Both genetic distance and CpG analyses above provide evidence families except Tc2_1_ML, which is likely older. It is interesting for very recent and likely ongoing piggyBac1_ML activity. Indeed, that the Tc2 elements have only accumulated two nested inser- a large fraction of the nonautonomous piggyBac1 elements iden- tions by their fellow DNA transposons. This could be explained tified in M. lucifugus are strictly identical (65.7% of the elements by the low occurrence of these elements (n = 593) and the rela- in the npiggy_156 subfamily and 30.7% in the npiggy_239 sub- tively small size of these insertions (∼560 bp). Mariner elements family), which strongly indicates that they have been inserted are present in larger numbers (>2000) and larger average size (778 very recently. As one can observe in Figure 3, most of the DNA bp), and thus present approximately five times the number of transposon families analyzed in this study appear to have potential insertion targets during an only slightly shorter time reached an amplification peak and then entered a period of re- frame. Analyses of Ves SINE nested insertions present a similar

Figure 4. Filled and intact preintegration site sequences for individual nonautonomous piggyBac1_ML elements from M. lucifugus and M. austrori- parius, respectively. Target site duplications for each insertion are shaded. (A) Locus 340_3005. (B) Locus primer sequences (5Ј–3Ј) for the two loci presented are as follows: Mluc_cont1.015340_3005_L— TTGTACCAAAAGGTGCCAAA, Mluc_cont1.015340_3005_R— TTTCCTCATATACCATCCCATTTT; Mluc_cont1.000939_4755_L—CAAAATAAGGGAGAAAGGAAACA, Mluc_cont1.000939_4755_R— GGGCTGAGAAACAAGATCCA. (Mluc) M. lucifugus; (Maus) M. austroriparius.

Genome Research 7 www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Ray et al.

Table 3. Results of nested insertion analysis for each DNA transposon family described one exception to this pattern is Tc1_1_ML, herein which exhibits a relatively low percentage Total bp Percent of nested insertions. It is unclear as to why Total bp verified nested nested this family exists as an outlier. Element family analyzed insertions insertions Insertion type Quantity piggyBac1_ML 620,634 214 0.03% Ves 1 Discussion hAT3_ML 2,649,287 5,029 0.19% Ves 24 piggyBac2_ML 1,495,500 6,829 0.46% Ves 32 DNA transposons are widespread and a di- Myotis_hAT1 4,835,096 27,258 0.56% piggyBac1_ML 2 versity of elements have been recently ac- Ves 126 tive in the genomes of many eukaryotes, in- hAT2_ML 1,217,239 10,814 0.89% Ves 52 cluding lower vertebrates (Aparicio et al. Tc1_1_ML 1,750,122 9,619 0.55% Myotis_hAT1 4 hAT3_ML 2 2002; Koga et al. 2006). In contrast, initial piggyBac1_ML 3 analyses of the transposable element land- Ves 36 scape in the complete human, mouse, rat, Mlmar1 1,613,024 25,975 1.61% Myotis_hAT1 3 and dog genomes have led to the common hAT2_ML 2 hAT3_ML 3 belief that mammalian genomes are devoid piggyBac1_ML 2 of recently active DNA transposons (Lander piggyBac2_ML 1 et al. 2001; Waterston et al. 2002; Gibbs et Tc1_1_ML 2 al. 2004; Lindblad-Toh et al. 2005). Sup- Ves 98 Tc2_ML 335,994 5,964 1.78% piggyBac1_ML 1 porting this idea, a detailed investigation of Tc1_1_ML 1 the evolutionary history of human DNA Ves 22 transposons revealed that many trans- posons were intensively active during early (Ves) VesSINE (Borodulina and Kramerov 1999). primate evolution (∼80–40 million years ago [Mya]), but this activity ceased approxi- picture. The four most recently active elements, piggyBac1_ML, mately ∼40 Mya (Pace and Feschotte 2007). Analyses of the ro- piggyBac2_ML, hAT2_ML, and hAT3_ML have been subjected dent and carnivore lineages reveal a similar trend, with no DNA only to Ves insertions. transposon family identified in either mouse or dog that is sig- Nested self-insertions are theoretically possible and were ex- nificantly younger than 50 Myr (Waterston et al. 2002; J. Pace pected when the analysis was performed. Indeed, there were and C. Feschotte, unpubl.). some possible instances observed. However, in each of these We have identified seven new families of DNA transposons cases the TSDs were not clearly identifiable. Because TSD identi- in the M. lucifugus genome which, in stark contrast to observa- fication was one a priori criterion for assessing the presence of tions made for other mammalian genomes, have apparently been any nested insertion (see Methods), we were forced to exclude active within the last 40 Myr. Several lines of evidence support these from Table 3. this conclusion. First, the elements appear to be almost exclu- Finally, one would expect older elements to have accumu- sively limited to Vespertilionidae, and in some cases to selected lated more nested insertions than more recently mobilized ele- taxa within the family. Second, the level of sequence divergence ments. To test this hypothesis, we examined the percent nested among copies (and to their respective consensus sequences) sug- insertion content for all instances of each family of elements gests that these families were active from ∼36 Mya to the present. recovered from the M. lucifugus genome. Figure 5 illustrates a Analysis of nested insertions suggests a hierarchical pattern of clear pattern of younger elements (i.e., piggyBac-like and hAT- insertion consistent with sequence divergence estimates. Third, like) having lower relative nested insertion content when com- two potentially full-length and intact piggyBac1_ML transposons pared with older elements from the Tc1/mariner superfamily. The could be identified in the available genome sequence data, sup- porting the hypothesis that the expansion of these elements and their nonautonomous relatives is ongoing. Finally, recent activ- ity of piggyBac1_ML was verified by recovering “empty” ortholo- gous sites in a closely related Myotis species and by the limited taxonomic distribution of piggyBac1_ML to North American Myo- tis species. Together with the recent study reporting the massive amplification of HeliBats 30–36 Mya in the lineage of M. lucifugus (Pritham and Feschotte 2007), our data demonstrate that the genomes of vespertilionid bats have been subjected to multiple waves of amplification of diverse DNA transposons, from ∼40 Mya to the present. One lineage of piggyBac, piggyBac1_ML,is likely still active. To our knowledge, this is the first documented evidence of the recent activity of a diverse population of DNA transposons in any . At least 3.5% of the available genome data is made up of sequence derived from the transposons described herein. It Figure 5. Proportion of insertions nested within instances of each DNA transposon family described. Families are arranged along the X-axis in should be noted, however, that this number is very likely an order of age based on genetic divergence estimates in Table 3. Precise underestimate. The previous study of hAT1 elements in M. lu- values for can be found in Table 3. cifugus revealed an abundance of nonautonomous nhAT families,

8 Genome Research www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Waves of DNA transposon activity in bats which have similar TIRs, but little internal sequence similarity that bats exhibit a variety of life history traits unique among with each other or with larger autonomous elements (Ray et al. mammals (flight, high population sizes and densities, torpor) 2007). Because of this lack of similarity, RepeatMasker fails to that provide potential avenues for future research. One of the identify a majority of the previously identified nhAT elements most intriguing of these characteristics is the fact that bats are when the consensus for Myotis_hAT1 is used as the part of the notorious viral reservoirs (Calisher et al. 2006). Like viruses, DNA database. Indeed, masking the genome using the Myotis_hAT1 transposons are examples of genomic parasites, and the ability of sequence produces only ∼21,000 hits greater than 100 bp. Creat- bats to safely harbor large loads of a variety of viruses may suggest ing a database that contains the full-length consensus and the similar genetic tolerance with regard to DNA transposons. On the nhAT elements described by Ray et al. (2007) as a single repeat other hand, viruses remain the best candidates as potential vec- database increases the number of hits to >96,000. We believe that tors for the horizontal introduction of DNA transposons and the other transposon families described herein will follow a simi- other TEs (Miller and Miller 1982; Fraser et al. 1983; Jehle et al. lar pattern, and therefore, our estimate of the amount of DNA 1998; Piskurek and Okada 2007). Thus, the propensity of bats to derived from DNA transposons is almost certainly too low. Add tolerate massive and diverse viral infections may have facilitated the impact of HeliBats, which accounts for at least an additional the recurrent horizontal introduction of DNA transposons and/ 3% of the M. lucifugus genome (Pritham and Feschotte 2007), and or their evolutionary persistence in vespertilionid bats. the as yet uncharacterized influence of retrotransposons suggests In conclusion, we have provided evidence that the previ- an extremely dynamic genomic landscape over the past 40 Myr ously identified trend toward DNA transposon extinction in of vespertilionid bat evolution. mammals is not universal and that a wide diversity of DNA trans- Myotis is one of the most species-rich of all mammalian gen- posons have been active throughout the diversification of the era, and repeated waves of transposon activity suggest a mecha- vespertillionid bats, that is, within the last 40 Myr. Furthermore, nism for generating the genetic variability necessary to produce we propose that the genus Myotis is an excellent candidate for its tremendous species diversity. Interestingly, a recent analysis studying the impact and influence of mobile elements, especially of Myotis phylogeny based on nuclear and mitochondrial DNA class 2 elements, on the evolution of genomic, species, and eco- sequences becomes more intriguing when one considers the data logical diversity in mammals. Concomitantly, studies of bats presented here. Stadelmann et al. (2007) found that a burst of may represent a unique opportunity to investigate life history Myotis diversification occurred ∼12–13 Mya. These dates corre- characteristics that make some organisms more susceptible to spond well to the estimated time during which the most active transposon activity than others. DNA transposon families were expanding in the Myotis genome (Fig. 2). The young age and narrow taxonomic distribution of all Methods (but one) of the transposon families identified in M. lucifugus raises the puzzling question of their evolutionary origin. Based Transposon identification and consensus reconstruction on the data presented here and elsewhere (Pace and Feschotte and classification 2007; Ray et al. 2007), we currently favor the hypothesis that We carried out initial computational searches from January most of these families hail from repeated episodes of horizontal through April 2007 against the M. lucifugus WGS data deposited introduction of founder autonomous transposons from yet un- at NCBI (http://www.ncbi.nlm.nih.gov/, GenBank accession no. known source(s) occurring at different evolutionary time points AAPE00000000; K. Lindblad-Toh, pers. comm.). At that time, the in the lineage of M. lucifugus. The only alternative hypothesis WGS data set represented ∼73% of the unassembled M. lucifugus would be that each family arose from active transposons that genome with contig sizes averaging ∼2.4 kb (Pritham and Fe- were vertically inherited during chordate evolution. We consider schotte 2007). The WGS data was queried using TBLASTN to this hypothesis less likely, because the lack of closely related el- detect the presence of coding sequences related to all known ements in any of the 30+ other mammalian species (including DNA transposon superfamilies (list of queries available upon re- three nonvespertilionid bats) for which complete or large quest). When significant hits were obtained (generally E- מ amounts of genome sequence data are currently available would value << 10 5), complete sequence entries spanning the top 10– entail that these elements have persisted in the lineage of ves- 50 subject hits (depending on the overall abundance and se- pertilionid bats, while being systematically lost in all other lin- quence heterogeneity of the hits) were retrieved from the eages. Under a scenario dominated by horizontal introduction, it database and aligned using a local installation of MUSCLE (Edgar appears that mariner, Tc1, and Tc2-like elements colonized the 2004) or CLUSTAL (Thompson et al. 1994). Sequences were ancestral vespertilionid genome prior to diversification of the trimmed to eliminate unalignable flanks and leave only TSDs, family. Later invasions of the hAT-like and piggyBac-like elements noncoding regions, and the presumed coding regions of the el- ements. Complete alignments used to determine the full-length may have occurred prior to diversification of Myotis and, in the consensus sequence for each family are presented as Supplemen- case of piggyBac1_ML, New World Myotis. Such a scenario of re- tal Alignments 1–8. Consensus sequences were derived from each peated horizontal transfer is consistent with the well-known pro- multiple alignment based on a majority rule. For each consensus, pensity and the apparent necessity of class 2 elements to undergo coding sequences were predicted using ORF Finder (http:// horizontal transfers (Hartl et al. 1997; Robertson 2002). A more www.ncbi.nlm.nih.gov/projects/gorf/) and, if necessary, refined thorough analysis of the distribution of these elements in bats manually at ambiguous positions in the consensus using mul- and other vertebrates will shed additional light onto their origin. tiple alignments of protein sequences from individual copies. These data also raised the puzzling question as to why bats Unless previous nomenclature had been established, families and (and Vespertilionidae in particular) exhibit such a high level of subfamilies were disclosed according to the standard principles of recent transposon activity, while these types of elements have TE classification (Wicker et al. 2007). seemingly gone extinct in other mammalian lineages examined. hAT3_ML was discovered independently using methods de- While we are currently unable to answer this question, we note scribed in Arensburger et al. (2005). After its initial characteriza-

Genome Research 9 www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Ray et al. tion, it was subjected to the same methods as all other elements a second RepeatMasker analysis using a custom library that in- described herein. All transposon consensus sequences used for cluded each of the new transposable elements described here and this manuscript have been deposited in Repbase (http:// Ves. We validated each nested insertion manually by checking www.girinst.org/repbase/index.html). for the presence of the expected target-site duplications.

Age estimation of M. lucifugus transposons Identification of intact preintegration sites in M. austroriparius We estimated ages for each transposon family by extracting full- To test the hypothesis that piggyBac1_ML has been recently active or near full-length ORF’s based on the coordinates of the repeats in Myotis, we designed oligonucleotide primers to survey the drawn from the output files of a locally implemented version of presence/absence of 15 individual insertions of a nonautono- RepeatMasker 3.1.6. Additionally, we hypothesized that nonau- mous piggyBac1_ML subfamily (npiggy_156)inM. austroriparius, ∼ tonomous subfamilies were likely mobilized by autonomous el- which diverged from M. lucifugus 8–12 Mya (Stadelmann et al. ements sharing the same TIRs, and thus were deposited during 2007). Primers were designed in the genomic regions flanking the the same time period as the autonomous elements. We therefore elements using sequence data from M. lucifugus and tested on a collected representatives from a limited number of nonautono- panel of 10 individuals from natural populations of M. austrori- mous derivatives for all families. All extracted elements were parius. Details on sample collection, DNA extraction, amplifica- aligned with their respective consensus sequence using MUSCLE tion, cloning, and sequencing are as previously described (Ray et (Edgar 2004). The most appropriate model for estimating nucleo- al. 2007). Sequences from the two intact preintegration site am- tide divergence from each consensus was determined using plicons shown in Figure 3 have been deposited in GenBank un- MODELTEST v3.7 (Posada and Crandall 1998) and genetic dis- der accession nos. EU177095 and EU177096. tances were calculated in PAUP* v4.0b10 (Swofford 2002). Specific neutral mutation rate estimates for bats are not cur- Taxonomic distribution of M. lucifugus transposons rently available. Thus, we utilized published sequences from rep- To investigate the presence of related transposon families in resentative vespertilionid (Nycticeinops schlieffeni, Myotis tricolor, other chiropteran or mammalian lineages, we used BLASTN or Scotophilus dinganii, Eptesicus hottentotus, Cistugo lesueuri, and Cis- TBLASTN with nucleotide or amino acid sequence queries corre- tugo seabrai) and miniopterid ( natalensis, M. inflatus, sponding to each family of transposon in separate searches of M. fraterculus, M. australis, and M. macrocneme) taxa to obtain an custom databases representing the following taxa: Chiroptera estimate. Specifically, we obtained sequence data generated by (taxid:9397, excluding Myotis lucifugus), Xenarthra (taxid:9348), Eick et al. (2005) for four introns (AJ865401-5, AJ865438-43, Afrotheria (taxid:311790), Laurasiatheria (taxid:314145), Euarch- AJ865646-50, AJ865683-88, AJ866297-301, AJ866329-34, AJ866349-53, AJ866383-87). Table 4. Oligonucleotide primer sequences used to investigate the taxonomic We constructed a concatenated alignment distribution of the DNA transposons described using MUSCLE (Edgar 2004) and deter- mined the most appropriate evolutionary Annealing ؇ a ؅ ؅ model for the data using MODELTEST v3.7 Primer temp. ( C) Oligonucleotide sequence (5 –3 ) (Posada and Crandall 1998), TIM+G ⌫ piggyBac1_593+ 58 GTCGCAGCATTCAGACTATAGTGATG ( = 1.2597). We then estimated the average piggyBac1_1424- CGCTTTCCTTCGCCGCAATAAATTTCC genetic distance between families Miniop- piggyBac1_1424+ 58 GGAAATTTATTGCGGCGAAGGAAAGCG teridae and Vespertilionidae as defined by piggyBac1_2216- GCACATGTAGCGTGTCTCACTCGC Eick et al. (2005) using PAUP* v4.0b10 piggyBac2_774+ 58 CCAACGAAACTGATACACTTCCGG (Swofford 2002). By incorporating the esti- piggyBac2_1536- GGTGTCACTCTCGCACACCATCC piggyBac2_1536+ 58 GGATGGTGTGCGAGAGTGACACC mated divergence time between these two piggyBac2_2340- GCACAAACTCTGCAGGCTCTCG ∼ families, 45 Mya (Eick et al. 2005; Miller- hAT1_1307+ 58 GGTTGGCCATTATTGCTAGATATTCTGATGG Butterworth et al. 2007), we arrived at an hAT1_1837- GCTGACCCACTTATGATCATTGATCTGAGG hAT1_1837+ 58 CCTCAGATCAATGATCATAAGTGGGTCAGC ,9מ10 ן estimated mutation rate of 2.366 which was then used to determine likely pe- hAT1_2263- CAATGATAATCAAATGGTACGAAAACTAGAAATTAGTGAC riods of activity. Representative alignments hAT2_309+ 58 GCCCTTCTAACTCTACCTGAGCG hAT2_973- GGTCTTTGATGTGCCTCTTGAGGTTTGG for several nonautonomous subfamilies are hAT2_973+ 59 CCAAACCTCAAGAGGCACATCAAAGACC presented in Supplemental Figure S1. All hAT2_1503- GTGACCTTCAGAGATATGCACGAGC alignments are available upon request. hAT3_1739+ 58 AATCTCAGCCGTCACTTTGC We estimated ages for each subfamily hAT3_2490- CAAGCTTTACGACTGGATTAACAA of elements using sequences as recovered hAT3_2519+ 58 TTCATTCGAGCAAAGGGACT hAT3_2959- CATGCAGGTCATCCAGTGAT from the genome and sequences from Tc1_483+ 59 ATGGGATATGGAAGCCGAAGGCC which the CpG dinucleotides had been re- Tc1_639- CCAAATTCTCACTCTGCCATCTGC moved (see Table 1). We also estimated the Tc1_639+ 59 GCAGATGGCAGAGTGAGAATTTGG relative ratio of CpG to non-CpG mutations Tc1_1151- CCACCTTTTGCCTTCAATACTGCG for each family (Table 2). This was accom- Tc2_248+ 60 GGACCAGAGCATTCTGAGCCC plished using a modified Perl script origi- Tc2_928- GGAGGCAGTTTAGTACCATCTGCAC Tc2_928+ 60 GTGCAGATGGTACTAAACTGCCTCC nally designed to estimate CpG dinucleo- Tc2_1471- GTGAGATCATCCTCAGTTCCGTCC tide mutation rates in primate Alu elements Mar_185+ 60 CGTGCCAAAAAAAGAGCATTTGCGG (Xing et al. 2004) and modified for this work. Mar_696+ GGTCAACCATCAACATCGACTGC Mar_696- 60 GCAGTCGATGTTGATGGTTGACC Nested insertion analysis Mar_1145- CAGGTAATTTGTGGATACCGTCCC We extracted all RepeatMasker hits for each Numbers in the primer name indicate their location in the respective consensus sequence. family of transposons and subjected each to aAnnealing temperature for each primer pair indicated in Figure 3.

10 Genome Research www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Waves of DNA transposon activity in bats

Table 5. Origin of tissue and DNA samples Blair, J.E. and Hedges, S.B. 2005. Molecular phylogeny and divergence times of Taxon Family Museum Voucher ID deuterostome . Mol. Biol. Evol. 22: 2275–2284. Myotis austroriparius Vespertilionidae Louisiana State University M8134 Borodulina, O.R. and Kramerov, D.A. 1999. Myotis horsfieldii Vespertilionidae Louisiana State University M4424 Wide distribution of short interspersed Corynorhinus rafinesquii Vespertilionidae Louisiana State University M8121 elements among eukaryotic genomes. FEBS Eptesicus furinalis Vespertilionidae Texas Tech University TK18815 Lett. 457: 409–413. Borodulina, O.R. and Kramerov, D.A. 2005. Kerivoula papillosa Vespertilionidae Texas Tech University TK152023 PCR-based approach to SINE isolation: Nycticeius humeralis Vespertilionidae Louisiana State University M8142 Simple and complex SINEs. Gene Minopterus sp. Miniopteridae Texas Tech University TK152087 349: 197–205. Artibeus jamaicensis Phyllostomidae Texas Tech University TK27682 Calisher, C.H., Childs, J.E., Field, H.E., Holmes, Hipposideros cervinus Hipposideridae Texas Tech University TK152133 K.V., and Schountz, T. 2006. Bats: Important Rhinolophus borneoensis Rhinolophidae Texas Tech University TK152256 reservoir hosts of emerging viruses. Clin. Natalas stramineus Natalidae Texas Tech University TK101001 Microbiol. Rev. 19: 531–545. Thyroptera tricolor Thyropteridae Texas Tech University TK18818 Coulondre, C., Miller, J.H., Farabaugh, P.J., and Pteronotus parnelli Mormoopidae Texas Tech University TK21800 Gilbert, W. 1978. Molecular basis of base Macroglossus sobrinus Pteropodidae Texas Tech University TK152047 substitution hotspots in Escherichia coli. Balionycteris sp. Pteropodidae Texas Tech University TK152117 Nature 274: 775–780. Demattei, M.V., Auge-Gouillou, C., Pollet, N., Hamelin, M.H., Meunier-Rotival, M., and Bigot, Y. 2000. Features of the mammal mar1 transposons in the human, sheep, cow, and ontoglires (taxid:314146), Mammalia (taxid:40674). We also que- mouse genomes and implications for their evolution. Mamm. ried the trace data from the sequencing efforts for the flying fox Genome 11: 1111–1116. Pteropus vampyrus (7,782,409 sequences averaging 970 bp). Edgar, R.C. 2004. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32: 1792–1797. Oligonucleotide primers complementary to portions of the Eick, G.N., Jacobs, D.S., and Matthee, C.A. 2005. A nuclear DNA coding region of each autonomous element (Table 4) were used phylogenetic perspective on the evolution of echolocation and to survey a taxonomically diverse panel of bat genomic DNAs historical biogeography of extant bats (chiroptera). Mol. Biol. Evol. (Table 5) via PCR. Reaction conditions were as described previ- 22: 1869–1886. Feschotte, C. and Pritham, E.J. 2007. DNA transposons and the ously (Ray et al. 2007) and annealing temperatures for each evolution of eukaryotic genomes. Annu. Rev. Genet. 41: 331–368. primer pair are found in Table 4. Feschotte, C., Zhang, X., and Wessler, S.R. 2002. Miniature inverted-repeat transposable elements and their relationship to Phylogenetic analyses established DNA transposons. In Mobile DNA II (eds. N.L. Craig et al.), pp. 1147–1158. ASM Press, Washington, DC. Amino acid sequences for known transposases from each super- Fraser, M.J., Smith, G.E., and Summers, M.D. 1983. Acquisition of host family were obtained from database searches and aligned using cell DNA sequences by baculoviruses: Relationship between host DNA insertions and FP mutants of Autographa californica and MUSCLE (Edgar 2004). Phylogenetic analyses were conducted us- Galleria mellonella nuclear polyhedrosis viruses. J. Virol. 47: 287–300. ing MEGA 4 (Kumar et al. 2004). Neighbor-joining trees were Gibbs, R.A., Weinstock, G.M., Metzker, M.L., Muzny, D.M., Sodergren, constructed using the equal input model with 5000 bootstrap E.J., Scherer, S., Scott, G., Steffen, D., Worley, K.C., Burch, P.E., et al. replicates. 2004. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428: 493–521. Gu, X. 1998. Early metazoan divergence was about 830 million years Acknowledgments ago. J. Mol. Evol. 47: 369–371. Hartl, D.L., Lohe, A.R., and Lozovskaya, E.R. 1997. Modern thoughts on This work was supported by the Eberly College of Arts and Sci- an ancyent marinere: Function, evolution, regulation. Annu. Rev. ences at West Virginia University (D.A.R.), by start-up funds from Genet. 31: 337–358. Jehle, J.A., Nickel, A., Vlak, J.M., and Backhaus, H. 1998. Horizontal UT Arlington (E.J.P. and C.F.) and grant R01GM77582-01 from escape of the novel Tc1-like lepidopteran transposon TCp3.2 into the National Institute of Health (C.F.). N.L.C. is an investigator of Cydia pomonella granulovirus. J. Mol. Evol. 46: 215–224. the Howard Hughes Medical Institute. We thank D. Hedges for Koga, A., Iida, A., Hori, H., Shimada, A., and Shima, A. 2006. Vertebrate modifying his original Perl script for the CpG analyses, S. DiFazio DNA transposon as a natural mutator: The medaka fish Tol2 element contributes to genetic variation without recognizable traces. Mol. for contributing Perl scripts to aid in processing RepeatMasker Biol. Evol. 23: 1414–1419. output, and R. Baker (Texas Tech University) and R. Stevens Kumar, S., Tamura, K., and Nei, M. 2004. MEGA3: Integrated software (Louisiana State University) for contributing tissue and DNA for molecular evolutionary genetics analysis and sequence samples for this study. We thank the Broad Institute Genome alignment. Brief. Bioinform. 5: 150–163. Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Sequencing Platform and Whole genome Assembly group, K. Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. Lindblad-Toh, and F. diPalma for making the data available. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860–921. Leaver, M.J. 2001. A family of Tc1-like transposons from the genomes of References fishes and frogs: Evidence for horizontal transmission. Gene 271: 203–214. Aparicio, S., Chapman, J., Stupka, E., Putnam, N., Chia, J.-m., Dehal, P., Lindblad-Toh, K., Wade, C.M., Mikkelsen, T.S., Karlsson, E.K., Jaffe, Christoffels, A., Rash, S., Hoon, S., Smit, A., et al. 2002. D.B., Kamal, M., Clamp, M., Chang, J.L., Kulbokas 3rd, E.J., Zody, Whole-genome shotgun assembly and analysis of the genome of M.C., et al. 2005. Genome sequence, comparative analysis and Fugu rubripes. Science 297: 1301–1310. haplotype structure of the domestic dog. Nature 438: 803–819. Arensburger, P., Kim, Y.J., Orsetti, J., Aluvihare, C., O’Brochta, D.A., and Miller, D.W. and Miller, L.K. 1982. A virus mutant with an insertion of Atkinson, P.W. 2005. An active transposable element, Herves, from a copia-like transposable element. Nature 299: 562–564. the African malaria mosquito Anopheles gambiae. Genetics Miller-Butterworth, C.M., Murphy, W.J., O’Brien, S.J., Jacobs, D.S., 169: 697–708. Springer, M.S., and Teeling, E.C. 2007. A family matter: Conclusive Avancini, R.M., Walden, K.K., and Robertson, H.M. 1996. The genomes resolution of the taxonomic position of the long-fingered bats, of most animals have multiple members of the Tc1 family of miniopterus. Mol. Biol. Evol. 24: 1553–1561. transposable elements. Genetica 98: 131–140. Pace II, J.K. and Feschotte, C. 2007. The evolutionary history of human Bennetzen, J.L. 2000. Transposable element contributions to plant gene DNA transposons: Evidence for intense activity in the primate and genome evolution. Plant Mol. Biol. 42: 251–269. lineage. Genome Res. 17: 422–432.

Genome Research 11 www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Ray et al.

Piskurek, O. and Okada, N. 2007. Poxviruses as possible vectors for genome of the amphibian Xenopus tropicalis. Gene 349: 187–196. horizontal transfer of from reptiles to mammals. Proc. Sinzelle, L., Chesneau, A., Bigot, Y., Mazabraud, A., and Pollet, N. 2006. Natl. Acad. Sci. 104: 12046–12051. The mariner transposons belonging to the irritans subfamily were Posada, D. and Crandall, K.A. 1998. MODELTEST: Testing the model of maintained in chordate genomes by vertical transmission. J. Mol. DNA substitution. 14: 817–818. Evol. 62: 53–65. Pritham, E.J. and Feschotte, C. 2007. Massive amplification of Smit, A.F. and Riggs, A.D. 1996. Tiggers and DNA transposon fossils in rolling-circle transposons in the lineage of the bat Myotis lucifugus. the human genome. Proc. Natl. Acad. Sci. 93: 1443–1448. Proc. Natl. Acad. Sci. 17: 422–432. Stadelmann, B., Lin, L.K., Kunz, T.H., and Ruedi, M. 2007. Molecular Pritham, E.J., Feschotte, C., and Wessler, S.R. 2005. Unexpected diversity phylogeny of New World Myotis (Chiroptera, Vespertilionidae) and differential success of DNA transposons in four species of inferred from mitochondrial and nuclear DNA genes. Mol. entamoeba protozoans. Mol. Biol. Evol. 22: 1751–1763. Phylogenet. Evol. 43: 32–48. Ray, D.A., Pagan, H.J.T., Thompson, M.L., and Stevens, R.D. 2007. Bats Swofford, D.L. 2002. Phylogenetic analysis using parsimony (and other with hATs: Evidence for recent DNA transposon activity in genus methods). Sinauer Associates, Sunderland, MA. myotis. Mol. Biol. Evol. 24: 632–639. Thompson, J.D., Higgins, D.G., and Gibson, T.J. 1994. CLUSTAL W: Razin, A. and Riggs, A.D. 1980. DNA methylation and gene function. Improving the sensitivity of progressive multiple sequence Science 210: 604–610. alignment through sequence weighting, position-specific gap Robertson, H.M. 1996. Members of the pogo superfamily of penalties and weight matrix choice. Nucleic Acids Res. DNA-mediated transposons in the human genome. Mol. Gen. Genet. 22: 4673–4680. 252: 761–766. Vigdal, T.J., Kaufman, C.D., Izsvak, Z., Voytas, D.F., and Ivics, Z. 2002. Robertson, H.M. 2002. Evolution of DNA transposons in eukaryotes. In Common physical properties of DNA affecting target site selection of Mobile DNA II (eds. N.L. Craig et al.), pp. 1093–1110. ASM Press, sleeping beauty and other Tc1/mariner transposable elements. J. Mol. Washington, DC. Biol. 323: 441–452. Ruvolo, V., Hill, J.E., and Levitt, A. 1992. The Tc2 transposon of Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Caenorhabditis elegans has the structure of a self-regulated element. Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., DNA Cell Biol. 11: 111–122. et al. 2002. Initial sequencing and comparative analysis of the SanMiguel, P. and Bennetzen, J.L. 1998. Evidence that a recent increase mouse genome. Nature 420: 520–562. in genome size was caused by the massive amplification of Wicker, T., Sabot, F., Hua-Van, A., Bennetzen, J.L., Capy, P., Chalhoub, intergene retrotransposons. Ann. Bot. (Lond.) 82: 37–44. B., Flavell, A., Leroy, P., Morgante, M., Panaud, O., et al. 2007. A Sarkar, A., Sim, C., Hong, Y.S., Hogan, J.R., Fraser, M.J., Robertson, H.M., unified classification system for eukaryotic transposable elements. and Collins, F.H. 2003. Molecular evolutionary analysis of the Nat. Rev. Genet. 8: 973–982. widespread piggyBac transposon family and related “domesticated” Xing, J., Hedges, D.J., Han, K., Wang, H., Cordaux, R., and Batzer, M.A. sequences. Mol. Genet. Genomics 270: 173–180. 2004. mutation spectra: Molecular clocks and the effect Schmid, C.W. 1998. Does SINE evolution preclude Alu function? Nucleic of DNA methylation. J. Mol. Biol. 344: 675–682. Acids Res. 26: 4541–4550. Sinzelle, L., Pollet, N., Bigot, Y., and Mazabraud, A. 2005. Characterization of multiple lineages of Tc1-like elements within the Received September 27, 2007; accepted in revised form March 10, 2008.

12 Genome Research www.genome.org Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press

Multiple waves of recent DNA transposon activity in the bat, Myotis lucifugus

David A. Ray, Cedric Feschotte, Heidi J.T. Pagan, et al.

Genome Res. published online March 13, 2008 Access the most recent version at doi:10.1101/gr.071886.107

Supplemental http://genome.cshlp.org/content/suppl/2008/04/08/gr.071886.107.DC1 Material

P

License

Email Alerting Receive free email alerts when new articles cite this article - sign up in the box at the Service top right corner of the article or click here.

Advance online articles have been peer reviewed and accepted for publication but have not yet appeared in the paper journal (edited, typeset versions may be posted when available prior to final publication). Advance online articles are citable and establish publication priority; they are indexed by PubMed from initial publication. Citations to Advance online articles must include the digital object identifier (DOIs) and date of initial publication.

To subscribe to Genome Research go to: https://genome.cshlp.org/subscriptions

Copyright © 2008, Cold Spring Harbor Laboratory Press