Retrotransposons
Proc. Natl. Acad. Sci. USA
Vol. 95, pp. 11284–11289, September 1998
Evolution

Determining and dating recent rodent speciation events by using L1 (LINE-1) retrotransposons

OLIVIER VERNEAU*†, FRANÇOIS CATZEFLIS‡, AND ANTHONY V. FURANO*§

*Laboratory of Molecular and Cellular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0830; and ‡Institut des Sciences de l’Evolution, Case Courrier 064, Université Montpellier 2, 34095 Montpellier, France

Communicated by Herbert Tabor, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, July 24, 1998 (received for review May 22, 1998)

ABSTRACT Phylogenies based on the inheritance of shared derived characters will be ambiguous when the shared characters are not the result of common ancestry. Such characters are called homoplasies. Phylogenetic analysis also can be problematic if the characters have not changed sufficiently, as might be the case for rapid or recent speciations. The latter are of particular interest because evolutionary processes may be more accessible the more recent the speciation. The repeated DNA subfamilies generated by the mammalian L1 (LINE-1) retrotransposon are apparently homoplasy-free phylogenetic characters. L1 retrotransposons are transmitted only by inheritance and rapidly generate novel variants that produce distinct subfamilies of mostly defective copies, which then "age" as they diverge. Here we show that the L1 character can both resolve and date recent speciation events within the large group of very closely related rats known as Rattus sensu stricto. This lineage arose 5–6 million years ago (Mya) and subsequently underwent two episodes of speciation: an intense one, ≈2.7 Mya, produced at least five lineages in <0.3 My; a second began ≈1.2 Mya and may still be continuing.

Abbreviations: My, million years; Mya, My ago; UTR, untranslated region; LINE, long interspersed repeated DNA element; SINE, short interspersed repeated DNA element; L1, LINE-1.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AJ004354–AJ004559).
†Present address: Laboratoire de Biologie Animale, UMR 5555 du Centre National de la Recherche Scientifique, Centre de Biologie et d’Ecologie Tropicale et Méditerranéenne, 66860 Université de Perpignan Cedex, France.
§To whom reprint requests should be addressed at: National Institutes of Health, Building 8, Room 203, 8 Center Drive, MSC 0830, Bethesda, MD 20892-0830. e-mail: [email protected].
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
© 1998 by The National Academy of Sciences 0027-8424/98/9511284-6$2.00/0
PNAS is available online at www.pnas.org.

Cladistics is a phylogenetic approach for classifying organisms into taxa based on shared inherited characters (1). The emphasis on inherited couples taxonomic classification to the evolutionary history of the examined taxa. This makes cladistics intellectually appealing since phylogeny is based on genealogy. The shared characters can range from classical morphological and biochemical to molecular sequence data. However, the major problem for cladistics is determining whether a shared character is inherited or arose independently because of convergence, parallelisms, or reversion to an ancestral state. Noninherited shared characters are called homoplasies, and they can lead to multiple, equally likely phylogenetic trees or, in extreme cases, a single incorrect tree (e.g., see ref. 2). An additional problem occurs for rapid speciations because phylogenetic characters may not have changed sufficiently (3). Recently we (4–6) and others (7–11) have shown, respectively, that L1 (LINE, long interspersed) and SINE (short interspersed) repeated DNA elements apparently are homoplasy-free characters. However, in most cases the repeated elements have been used differently as phylogenetic characters. Although the phylogenetic distribution of distinct SINE families has been informative (11), usually the presence or absence of SINE element insertions at particular loci has been used as a phylogenetic character. While L1 elements also can be used this way, the presence or absence of distinct multicopy L1 subfamilies has been scored as the phylogenetic character.

This difference stems from the distinct biological properties of these elements. L1 elements are prolific, self-replicating mammalian retrotransposons that rapidly generate distinct novel subfamilies consisting mostly of defective (pseudo) copies (see legend to Fig. 1). The defective subfamily members are retained in the genome and diverge from each other with time at the pseudogene (neutral) rate. The rapid generation of novel L1 characters keeps pace with speciation, and the sequence divergence of the various defective subfamily members theoretically permits the dating of the speciations (4). By contrast, although SINE elements can be organized into subfamilies, they are not self-replicating and there are not enough distinct SINE families to generate high-resolution trees (11). Although individual SINE insertions are very robust phylogenetic characters and can generate detailed phylogenies, they cannot be used to date phylogenetic events (12).

Here we demonstrate that the L1 phylogenetic character can determine and date phylogenetic events within Rattus sensu stricto. These rodents consist of ≈50 very closely related taxa that evolved very recently and have been largely refractory to phylogenetic analysis (13–16). We found that the Rattus sensu stricto lineage, which we redefine partially here, emerged ≈7.5–5.5 million years ago (Mya). Rattus sensu stricto then underwent two intense speciations: one occurred ≈2.7 Mya and generated five Rattus lineages in less than 0.3 My; a second began ≈1.2 Mya and may still be continuing.

MATERIALS AND METHODS

Biological Specimens. The rodent samples (except R. norvegicus from New York and Mus musculus domesticus, a laboratory strain), were from the collection of the Institut des Sciences de l’Evolution of Montpellier II (17). The species names, registry numbers, geographical localities, and collectors of the different specimens have been described (6). We follow the nomenclature and taxonomy presented in ref. 18 with the following exceptions as explained in ref. 6: Niviventer niviventer, Rattus flavipectus, R. cf moluccarius, and R. satarae. Of the 26 species of Rattus sensu lato examined, 4 belong to the Maxomys genus, 4 belong to Niviventer, 2 belong to Leopoldamys, 1 each belong to Berylmys, Sundamys, and Bandicota, and 13 belong to Rattus. For outgroup comparisons, we examined four Murinae species: Mus musculus domesticus, Aethomys namaquensis, Thamnomys gazellae, and Conilurus penicillatus; and four non-Murinae species: Cricetomys gambianus, Tatera indica, Akodon torques, and Arvicola terrestris.

General Techniques. DNA was purified from preserved tissues of the above specimens as described (6). The DNA was digested with Sau3AI and NlaIII, whose sites are highly conserved in the 3′ untranslated region (UTR) of rat L1 elements (see legend to Fig. 1 and refs. 6 and 19) and define a 215-bp fragment that was purified by gel electrophoresis and ligated to the dephosphorylated BamHI site of pUC19 as described (6). Transfected bacteria were screened for L1-containing clones by hybridization with a fragment of the 3′ UTR at moderate stringency (6). DNA sequencing, blotting of restriction endonuclease-digested genomic DNA, and blot hybridizations with oligonucleotide probes were carried out by using standard procedures described in ref. 6 or refs. 20–22. The 206 sequences that had not been reported previously have been deposited in GenBank/EMBL (accession nos. AJ004354–AJ004559). Generally, these sequences correspond to the expected ≈215 bp of the 3′ UTR. However, occasionally both longer fragments (≈320 bp) and shorter fragments (<≈150 bp) were sequenced.

DNA Sequence

probes defined from this part of the alignment. Subfamilies (e.g., L1rat290 and L1rat300, L1rat380 and L1rat390) that are identical in this region of the alignment are clearly distinguished in regions not displayed in Fig. 1. Oligonucleotide probes were hybridized to blots of genomic DNA that had been digested with various restriction endonucleases (Materials and Methods and refs. 4 and 6). As shown previously and discussed in detail (4, 6), these reactions generally revealed distinctive patterns of hybridized bands that greatly enhanced both the specificity and the information content of the hybridizations.

Since the L1 phylogenetic character is the result of a hybridization reaction, we refer to each L1 character by the name of the oligonucleotide hybridization probe rather than by the name of the L1 subfamily upon which the oligonucleotide was defined. This avoids confusion because an ancestral L1 oligonucleotide character can be retained in present-day mammals in two ways. First, old L1 subfamilies are not cleared from the genome. Therefore, the oligonucleotide characters defined on old L1 subfamilies will be retained until they are no longer detectable by hybridization because of the accumulation of random mutations as the old L1 elements "age" as pseudogenes.