Copyright © 2004 by the Genetics Society of America
DOI: 10.1534/genetics.103.025775

Evolution of P Elements in Natural Populations of Drosophila willistoni and D. sturtevanti

Joana C. Silva1 and Margaret G. Kidwell

Department of Ecology and Evolutionary Biology, The University of Arizona, Tucson, Arizona 85721
Manuscript received December 12, 2003
Accepted for publication June 25, 2004

ABSTRACT
To determine how population structure of the host species affects the spread of transposable elements and to assess the strength of selection acting on different structural regions, we sequenced P elements from strains of Drosophila willistoni and Drosophila sturtevanti sampled from across the distributions of these species. Elements from D. sturtevanti exhibited considerable sequence variation, and similarity among them was correlated to geographic distance between collection sites. By contrast, all D. willistoni elements sampled were essentially identical (π < 0.2%) and exhibited patterns typical of a recent population expansion. While the canonical P elements sampled from D. sturtevanti appear to be long-time residents in that species, a rapid expansion of a very young canonical P-element lineage is suggested in D. willistoni, overcoming barriers such as large geographical distances and moderate levels of population subdivision. Between-species comparisons reveal selective constraints on P-element evolution, as indicated by significantly different substitution rates in noncoding, silent, and replacement sites. Most remarkably, in addition to replacement sites, selection pressure appears to be strong in the first and third introns and in the 3′ and 5′ flanking regions.

P ELEMENTS are one of the most thoroughly studied families of eukaryotic transposable elements (TEs), and much is known about their structure, transposition mechanisms, and evolutionary history (O’Hare and Rubin 1983; Engels 1989, 1996; Clark et al. 2002; Rio 2002). Here we address two understudied aspects of P-element evolution, namely their transmission within natural populations and the distribution of sequence motifs affecting element fitness among different structural regions.

The canonical P element, first isolated from Drosophila melanogaster, is ~3 kb long and contains four open reading frames (ORFs) that together encode a transposase (Figure 1; O’Hare and Rubin 1983). In addition, a truncated polypeptide consisting of only the first three ORFs and part of the third intron encodes a repressor of transposition (Laski et al. 1986; Robertson and Engels 1989; Misra and Rio 1990; Gloor et al. 1993). P-element sequences have been grouped into ~25 subfamilies according to their level of sequence identity and host taxa (Hagemann et al. 1994, 1996a,b; Clark and Kidwell 1997; Sarkar et al. 2003; Oliveira de Carvalho et al. 2004). The distribution of P elements seems to be mainly restricted to the order Diptera, the vast majority of the elements having been sampled from the genus Drosophila and a few closely related genera (for a review see Clark et al. 2002), as well as from Anopheline mosquitoes (Sarkar et al. 2003; Oliveira de Carvalho et al. 2004).

Only two P-element subfamilies include elements that are known to be active. One of them, the canonical subfamily, owes its name to the inclusion of the canonical element from D. melanogaster and others that are closely related to it (O’Hare and Rubin 1983). This subfamily is characteristic of the two New World species groups of the subgenus Sophophora, the Drosophila willistoni and Drosophila saltans groups (Clark et al. 1995), and is the main focus of this study. The second active P element was found in Scaptomyza pallida (Simonelig and Anxolabéhère 1991), and belongs to the M-type subfamily (Hagemann et al. 1994).

Sequence data from this article have been deposited with the GenBank Data Library under accession nos. AY578739–AY578784.
1Corresponding author: The Institute for Genomic Research, 9712 Medical Center Dr., Rockville, MD 20850. E-mail: [email protected]

Previous studies revealed the presence of selective constraints acting on nonsynonymous sites of P elements (Witherspoon 1999; Silva and Kidwell 2000). These results were explained by purifying selection acting at the level of the host to preserve the functional coding sequence of the repressor polypeptide. In addition, selection was also explained by the advantage that autonomous elements (those encoding all sequence factors required for transposition) have over nonautonomous ones at the time of horizontal transfer, which is considered to be an essential step in the P-element life cycle (Hartl et al. 1997; Pinsker et al. 2001; Silva et al. 2004). However, those studies were based on either a small number of sequences or a small coding region

Genetics 168: 1323–1335 (November 2004)

of the P element, which limited the power of the analyses and prevented assessment of possible constraints in noncoding regions.

While several studies have explicitly addressed the transfer of canonical P elements among species (Clark and Kidwell 1997; Silva and Kidwell 2000; Loreto et al. 2001), the spread of these elements within species has been studied only in D. melanogaster. It has been argued that the association of D. melanogaster with human activities has played a major role in the acquisition and subsequent rapid spread of P elements throughout worldwide populations of the species in <200 years (Kidwell 1979; Engels 1992). Therefore, it is questionable whether the spread of P elements in this species is representative of the rapidity of spread of TEs in host populations.

Here we report a study of the molecular evolution of canonical P elements within two species, D. willistoni and D. sturtevanti, members of the willistoni and saltans groups, respectively. The main goals of this study were twofold: (1) to assess the nature of selective constraints acting in both coding and noncoding regions of P elements within and between species, on the basis of a large sample of elements, and (2) to investigate the spread of P elements among natural populations of two species that, in contrast with D. melanogaster, are not commensal with humans.

Figure 1.—Schematic of the structure of the canonical P element. (A) The canonical D. melanogaster P element. The element is flanked by 31-bp inverted terminal repeats (represented by arrows). These are adjacent to 5′ and 3′ terminal regions (represented by single lines). The element has four ORFs, numbered 0–3, which together encode a transposase enzyme. ORFs are represented by open rectangles, and introns by single lines. The first and last positions of each noncoding region are numbered above. The inverted terminal repeats were excluded from all analyses, as they were the primer-binding regions. (B) Structure of the P elements sequenced from strains of D. willistoni and D. sturtevanti. Open boxes represent open reading frames, and horizontal lines represent introns and flanking regions. The location of insertions is marked by a T-shaped sign, and their relative length is indicated by the size of the top of the T. An interruption in the structure of an element represents a large deletion. All the elements sequenced from D. willistoni are complete, except for small indels.

MATERIALS AND METHODS

Strains: P-element sequences were obtained from 35 strains of D. willistoni and 9 strains of D. sturtevanti. The origin of these strains is broadly representative of the distributions of the two species, which range from Mexico and Florida in the north to Southern Brazil, but sampling was more exhaustive in D. willistoni than in D. sturtevanti (Figure 2, Table 1).

DNA isolation and sequencing: Total genomic DNA was obtained from each strain following standard protocols. P elements were PCR amplified from each strain using oligonucleotide PInvRep (Table 2) under the following conditions: 2 min of initial template denaturation at 95°, 20 cycles of 30 sec denaturation at 95°, 20 sec of primer annealing at 55°, and 3 min of primer extension at 68°, with a final extension for 7 min at 68°. PCR amplifications were done using Platinum

Taq DNA polymerase High Fidelity (GIBCO BRL, Gaithersburg, MD; error rate ~0.8 × 10⁻⁶) and standard concentrations of all reagents. Few rounds of amplification were used so as to minimize potential polymerase errors in the PCR products. The expected number of errors in a 3-kb fragment in 20 rounds of amplification is 3000 × 20 × 0.8 × 10⁻⁶, or 0.048/fragment. In the whole data set of 46 sequences, there will be on average approximately two false polymorphisms. The products of PCR amplifications were cloned into the TOPO TA cloning vector (Invitrogen, San Diego). For each strain, one clone was chosen randomly among those with an insert of ~3 kb, the length of the canonical element, and sequenced. When no inserts of the desired size were present, as happened in a few strains of Drosophila sturtevanti, inserts of other sizes were chosen. P-element inserts were sequenced directly, using a series of internal primers that allowed sequencing of each element in both directions (Table 2). Sequences were obtained by the Laboratory for Molecular Systematics and Evolution at the University of Arizona, using an ABI 377 automated sequencer. Several P-element sequences were obtained from the literature: the canonical P element (O’Hare and Rubin 1983), the functional P element from S. pallida (Simonelig and Anxolabéhère 1991), the canonical elements from Drosophila mediopunctata (Loreto et al. 2001) and Drosophila nebulosa (Lansman et al. 1987), and partial P sequences obtained by Clark et al. (1995) from D. sturtevanti, D. willistoni, and representatives of noncanonical P subfamilies.

Sequence analyses: Alignment of all sequences was done by eye using MacClade 4 (Maddison and Maddison 2001). The location of all insertions, deletions, and point mutations was identified using SITES (Hey and Wakeley 1997).

Nucleotide variation: Estimates of heterozygosity per site were obtained from the average pairwise number of differences between elements, π (Nei and Li 1979), and from θ, calculated on the basis of the number of polymorphic sites, S (Watterson 1975). These calculations were performed with DnaSP 3 (Rozas and Rozas 1999).

Substitution estimates: The number of synonymous substitutions per synonymous site, dS, and the number of nonsynonymous substitutions per nonsynonymous site, dN, were estimated using the method of Nei and Gojobori (1986). Standard deviations for the average dS and dN within and between groups of sequences were calculated as described by Nei and Jin (1989). Divergence in noncoding regions (introns and untranslated regions) was estimated using the Jukes-Cantor method (Jukes and Cantor 1969). Maximum-likelihood estimates of ω among D. willistoni elements, where ω = dN/dS, were obtained using the method of Goldman and Yang (1994) implemented in PAML 3.13 (Yang 1997). The model of evolution used assumes one ω for all sites, and we tested whether the value of ω estimated from the data provided a significantly better fit to the data than when ω is fixed and equal to 1. Because the model in which ω is constrained is a special case of the more general model where ω is free to vary, the difference in likelihood of the two models can be tested for statistical significance using a likelihood-ratio test (LRT), in this case by comparing the LRT statistic to a χ² distribution with 1 d.f. (Yang and Nielsen 2002).

Phylogenetic analysis: The phylogenetic relationship among P-element sequences was reconstructed by maximum parsimony. Tree space was searched using branch-and-bound. Bootstrap analysis consisted of 100 bootstrap replicates using branch-and-bound. Phylogenetic analyses were performed in PAUP* (Swofford 1999).

Figure 2.—Collection sites of D. willistoni and D. sturtevanti strains. Numbers correspond to the first column of Table 1, where location, date, and collector are specified for each strain.

RESULTS

Length polymorphism in D. willistoni P elements: Amplification of P elements from all 35 collected strains of D. willistoni produced a fragment ~3 kb long. Most strains also yielded smaller-sized fragments. All 3-kb P elements sampled are identical in structure and sequence to the D. melanogaster canonical element (Figure 1 and Table 3), except for rare insertions, deletions, and point mutations. Most indels are one nucleotide (nt) in length and hence disrupt the reading frame when located in a coding region. There are four longer deletions, which all occur in ORFs, one of which has a length that is not a multiple of three. There are fewer indels per nucleotide in coding regions (9 in 2223 nt) than in noncoding regions (4 in 655 nt), but the difference is not statistically significant (G = 0.442 ≪ χ²[0.05, 1] = 3.84; indels of length multiple of three, which do not disrupt the reading frame, were excluded, making the test conservative).

Nucleotide polymorphism in D. willistoni P elements: There are 78 polymorphic sites among the 35 P elements examined (Table 3). Of these, 75 are singletons, two are doubletons, and one is present in three sequences. The 78 polymorphisms can be classified as follows: 17 are in noncoding regions, 23 are synonymous, 36 are replacements, and 2 lead to termination codons, one of which is the mutation present in three sequences

TABLE 1
Strains of D. willistoni and D. sturtevanti: collection date and location

Key  Strains                     Location                                      Date  Collector (a, b)

D. willistoni
Florida
1    Florida2                    Royal Palm Park, Florida                            DRC-0811.2
1    Florida6                    Fairchild Gardens, Florida                          DRC-0811.6
Antilles
2    LAntilles1                  Toro Negro, Puerto Rico                       1994  HH
3    LAntilles3                  Grand Etang, Crator Lake, Grenada             1997  HH
3    Lantilles4, -6, -7          Vermont, St. Vincent, and the Grenadines      1997  HH
Mexico
4    Apaz2, -7, -10, -27, -33    Apazapán, Veracruz                            1998  JS
5    Atlixco                     Atlixco, Veracruz                                   DRC-0811.3
Central America
6    El Salvador                 Instituto Tropical, San Salvador, ES                DRC-0811.1
7    Nicaragua                   Santa Maria de Ostuna, Nicaragua                    DRC-0811.0
Ecuador
8    JSacha1, -3, -4, -5, -6.3   Jaton Sacha, near Tena                        1997  PO
N. Brazil
9    Manaus                      Manaus, Amazónia                              1986  VV and CR
10   Para4, -8, -11, -12         Belém, Pará                                   1997  VV and CR
S. Brazil: islands
11   Imel                        Ilha do Mel, Paraná                           1994  VV and CR
12   Ilha2, -4, -6, -8, -10      Ilha dos Ratões, Santa Catarina               1997  VV and CR
S. Brazil: mainland
13   Itapua1, -9, -13, -15, -16  Itapuã Park, Rio Grande do Sul                1997  VV and CR

D. sturtevanti
Antilles
14   DominicanRep1, -4           Montecristi, Dominican Republic                     DRC-0871.11
15   Jamaica                     Jamaica                                             DRC-0871.04
Mexico
4    A10S(4,10), A16S            Apazapan, Veracruz                            1998  JS
16   Matlapa2                    Matlapa, San Luis Potosí                      1998  JS
Central America
6    El Salvador                 Instituto Tropical, San Salvador, El Salvador       DRC-0871.5
17   Panama12                    Panama                                        1999  TM
Brazil
18   Ceara                       Pacatuba, Ceará                               198?  FMS (CC)
19   I27-Minas Gerais            Serra do Cipo, Minas Gerais                   1995  CV (CC)

a Origin of stocks: DRC, National Drosophila Species Resource Center; CC, Cláudia Carareto, Universidade Estadual Paulista; CR, Cláudia Rhode, Universidade Federal do Rio Grande do Sul; CV, Carlos Vilela, Universidade de São Paulo; FMS, Fábio de Melo Sene, Universidade de São Paulo; HH, Hope Hollocher, Princeton University; JS, Joana Silva, The University of Arizona (now at The Institute for Genomic Research); PO, Patrick O’Grady, The University of Arizona (now at the University of Vermont); TM, Terry Markow, The University of Arizona; VV, Vera Valente, Universidade Federal do Rio Grande do Sul.
b The D. willistoni stocks from the Drosophila Resource Center are numbered 14030–0811.X and those of D. sturtevanti 14043–0871.X (where X stands for the number of each specific stock of the species). Only the digits after the dash are listed. We could not obtain the collection dates for these stocks but they are likely to date from the 1960s or 1970s (W. Heed, personal communication).
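The G statistic quoted for the indel comparison in D. willistoni (9 indels in 2223 coding nt vs. 4 in 655 noncoding nt) can be checked with a few lines of Python. The article does not state how G was computed; the sketch below assumes a goodness-of-fit G-test in which the expected number of indels in each region is proportional to the length of that region. The function name and the hard-coded critical value are illustrative choices, not taken from the article.

```python
import math

def g_test_goodness_of_fit(observed, exposure):
    """G statistic for observed counts against expectations
    proportional to `exposure` (here, region length in nt)."""
    total_observed = sum(observed)
    total_exposure = sum(exposure)
    g = 0.0
    for count, size in zip(observed, exposure):
        expected = total_observed * size / total_exposure
        if count > 0:  # a zero count contributes nothing to the sum
            g += count * math.log(count / expected)
    return 2.0 * g

# Counts and lengths as given in the text (coding, noncoding).
indels = [9, 4]
region_lengths = [2223, 655]

# Critical value of the chi-square distribution, 1 d.f., alpha = 0.05.
CHI2_CRIT_1DF_005 = 3.841

g = g_test_goodness_of_fit(indels, region_lengths)
print(f"G = {g:.3f}; exceeds critical value: {g > CHI2_CRIT_1DF_005}")
```

Under this assumption the statistic comes out at about 0.44, in line with the reported G = 0.442 and well below the critical value.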

(position 2625). These polymorphisms are distributed evenly along the element, as revealed by similar levels of polymorphism in different structural regions (Table 4). Within coding regions, polymorphisms are about twice as frequent in synonymous as in nonsynonymous sites, but the difference is not statistically significant (Table 4). A more precise estimate of ω, the ratio dN/dS, obtained using maximum likelihood, supports this result. The maximum-likelihood estimate of ω is 0.634 (ln L = −3817.9547). However, this value does not provide a significantly better fit to the data than ω = 1 (ln L = −3819.2663), as the LRT statistic is 2 × [−3817.9547 − (−3819.2663)] = 2.623, which is much smaller than the critical value of χ²[0.05, 1] = 3.841. It is unclear whether this lack of significance reflects a true value of ω = 1 (i.e., that ω ~0.6 is just stochastic variance

around the mean of ω = 1), or whether it results from insufficient power of the method to detect significance at very low levels of divergence.

Length polymorphism in the D. sturtevanti P elements: The P-element fragments amplified using the terminal repeat primers in D. sturtevanti strains varied in size between 500 bp and 4 kb. The PCR products amplified from strains from Mexico (Apazapán and Matlapa), El Salvador, and Panama had two sizes, of ~2.7 and 3 kb in length. One random 3-kb clone was sequenced from each strain, except for A10S from which two randomly chosen clones were sequenced. The PCR result from the Jamaican line showed a single 500-bp product (one was randomly chosen for sequencing); that from the Dominican Republic yielded two fragments of ~3 kb and 650 bp (one of each size was sequenced). In total, 11 P elements were obtained from the nine strains of D. sturtevanti (Table 1). Structurally, these elements are considerably more polymorphic than those obtained from D. willistoni (Figure 1). Some have deletions that encompass almost the entire length of the element, one has a large insertion, and others have smaller deletions of various lengths (Table 5). The presence of multiple indels that disrupt the reading frame of the transposase makes it unlikely that any of these elements encodes a functional transposase.

Nucleotide variability in D. sturtevanti P elements: In addition to the high degree of sequence length polymorphism, the elements from D. sturtevanti also differ markedly from one another in nucleotide sequence: there are 283 nucleotide polymorphisms in the 11 D. sturtevanti P elements sequenced. In comparison with the canonical P-element reference sequence, the polymorphisms in the D. sturtevanti elements can be grouped as follows: 84 occur in noncoding regions and 199 in coding regions. Of the latter, 64 are synonymous and 135 are nonsynonymous polymorphisms.

Phylogenetic relationships among P elements: Phylogenetic relationships among D. willistoni and D. sturtevanti P-element sequences are depicted in Figure 3. The element from S. pallida and others representing noncanonical subfamilies were used as outgroups. Due to their similarity, only a few representative elements were chosen from among all D. willistoni sequences and from among the Mexican strains of D. sturtevanti.

Several interesting patterns emerge from this analysis. Our sampling scheme that was designed to detect canonical elements did indeed produce only canonical elements, with the possible exception of the element Matlapa2 from D. sturtevanti, which contains a large insertion (Figure 1). The elements from D. willistoni and D. sturtevanti, as well as other known canonical elements from D. nebulosa, D. capricorni, and D. mediopunctata, form a monophyletic clade, with the exception of Dsturtevanti42, which is known to belong to a noncanonical subfamily (Clark et al. 1995), and Dst-Matlapa2, which groups with noncanonical elements. Also, the elements from D. sturtevanti are paraphyletic in relation to those from D. willistoni and other species and form clades according to the geographic origin of the strains from which they were obtained. The D. sturtevanti elements from Mexico (with the exception of Matlapa2) group with those from Central America. One of the elements from the Dominican Republic forms a clade with the one from Jamaica and, finally, the D. sturtevanti elements from Brazil form a monophyletic group. This geographic grouping also coincides with the distribution of the length of PCR products obtained in each line and is further supported by the location and size of indels (Figure 1 and Table 5). Each of the clades identified in this analysis was treated independently for the purposes of polymorphism and divergence estimates.

Divergence between D. willistoni and S. pallida P elements: The D. melanogaster canonical element differs from the consensus sequence of the D. willistoni elements only at nucleotide position 32 (“A” in D. melanogaster and “G” in D. willistoni); this extremely low divergence is explained by the recent horizontal transfer of a P element between the two species (Daniels et al. 1990). We estimated the divergence of D. willistoni P elements from the S. pallida P element 18 (Simonelig and Anxolabéhère 1991). The latter is a member of the only P-element subfamily, other than the canonical one, for which functionality has been confirmed in vivo, and differs from the canonical element by ~25% (Silva and Kidwell 2000). The number of substitutions per synonymous and per nonsynonymous site (dS and dN, respectively) were estimated separately for each P-element region, as well as for the coding region as a whole (Table 4). Several remarkable results emerge from these

TABLE 2
P-element primers used in PCR and sequencing reactions

Primer designation (a)  Primer sequence
PInvRep                 5′-CATAAGGTGGTCCCGTCG
P427U                   5′-GGGAGTACACAAACAGAG
P979U                   5′-CTGGCTATTGTTCGTGGT
P980U                   5′-TGGCTATTGTTCGTGGTC
P1447U                  5′-CGCTCGCAAAACAGAAGGT
P1890U                  5′-CAATTCGACCATCCCACTC
P2422U                  5′-TCTCACGGCGGACTTATTA
P490L                   5′-AACAGGACCTAACGCACAG
P1032L                  5′-CGGGTCCATTCGGGTATTA
P1489L                  5′-GAGCTAGCGGTGGTATTCG
P2000L                  5′-AAAGTGCAGTGGAGTGGG
PE2040L                 5′-ATAGGCTCCTTTTGCTG
P2457L                  5′-TAAGTCCGCCGTGAGACA

a In each primer designation, the number corresponds to the approximate location in the canonical P element (O’Hare and Rubin 1983), and the letters U and L map the primer to the coding and noncoding strands, respectively. The primer PE2040L is specific to the D. sturtevanti sequence from Matlapa, which has a unique insertion.

TABLE 3 Variable nucleotide positions in D. willistoni P elements

The subregions in the element are marked on top. The 5′ and 3′ ends correspond to the noncoding flanking regions, and i2 and i3 correspond to introns 2 and 3, respectively. The position of each mutation in the data set is indicated and should be read vertically. Asterisks at the bottom of the figure indicate replacement mutations and the letter S indicates nonsense mutations, which lead to stop codons. All other mutations in the coding region are synonymous.
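The likelihood-ratio test applied to ω among the D. willistoni elements is simple enough to reproduce from the two log-likelihoods reported in the text. The sketch below is illustrative only: the function name is hypothetical, and the P value is obtained from the standard identity that a χ² variable with 1 d.f. is the square of a standard normal variable, which is an addition here and not a calculation reported in the article.

```python
import math

def likelihood_ratio_test_1df(lnl_general, lnl_constrained):
    """Return the LRT statistic and its P value under a chi-square
    distribution with 1 degree of freedom."""
    statistic = 2.0 * (lnl_general - lnl_constrained)
    statistic = max(statistic, 0.0)  # guard against tiny negative values
    # For 1 d.f., P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Log-likelihoods as reported: omega estimated (0.634) vs. omega fixed at 1.
LNL_OMEGA_FREE = -3817.9547
LNL_OMEGA_FIXED = -3819.2663

statistic, p_value = likelihood_ratio_test_1df(LNL_OMEGA_FREE, LNL_OMEGA_FIXED)
print(f"LRT statistic = {statistic:.3f}, P = {p_value:.3f}")
```

The statistic of about 2.62 falls below the 5% critical value of 3.841, so the constrained model (ω = 1) is not rejected, as stated in the text.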

comparisons. First, dS is significantly larger than dN in all comparisons. Second, dN is larger in ORF3 than in the other three ORFs (significantly so in relation to ORFs 0 and 1). Finally, the number of substitutions in noncoding regions is significantly lower than that in synonymous sites, with the exception of intron 2. The significance of these observations is addressed in the discussion.

Polymorphism and divergence in D. sturtevanti P elements: Five D. sturtevanti elements from Mexico and Central America were grouped into clade A (Figure 3). These elements are almost intact structurally, and the few indels observed are almost all fixed among them (Table 5). The degree of polymorphism in synonymous sites does not differ significantly from that in nonsynonymous sites (Table 6). However, when D. sturtevanti elements are compared with those from D. willistoni, dS is significantly larger than dN, despite the relatively small divergence between the elements from these two species (<10%). In addition, and as already observed for D. willistoni P elements, the 5′ and 3′ regions flanking the transposase, as well as the first and third introns, have tended to evolve more slowly than silent sites (the difference is statistically significant for intron 1), and nonsynonymous sites in ORF3 have evolved significantly faster than those in the other three ORFs.

Two P elements from the Antilles strains of D. sturtevanti (Jamaica and the Dominican Republic) also form a monophyletic group, clade B. They share one small insertion in the 3′ end and one large internal deletion that eliminates most of ORF0, -1, and -2, and half of ORF3 (Figure 1 and Table 5). These elements bear a strong similarity to the canonical elements from D. willistoni (Table 7). When these elements are compared to the canonical elements from D. willistoni, synonymous sites evolve faster than nonsynonymous sites and 5′ and

TABLE 4
D. willistoni P elements: polymorphism and divergence per region

                  Polymorphism in D. willistoni P elements (n = 35)       vs. S. pallida P element
Region    Length  π       θ       dS (SD)           dN (SD)               dS (SD)        dN (SD)
5′ end    122     0.0034  0.0123                                                         0.28 (0.046)***
ORF0      291     0.0015  0.0063  0.0029 (0.0019)   0.0010 (0.0008) NS    0.97 (0.233)   0.14 (0.026)***
Intron 1  58      0       0                                                              0.29 (0.049)***
ORF1      666     0.0013  0.0053  0.0015 (0.0011)   0.0013 (0.0007) NS    0.94 (0.142)   0.11 (0.016)***
Intron 2  53      0.0011  0.0046                                                         0.98 (0.164) NS
ORF2      726     0.0017  0.0071  0.0037 (0.0020)   0.0011 (0.0006) NS    1.12 (0.176)   0.18 (0.016)***
Intron 3  190     0.0012  0.0051                                                         0.26 (0.043)***
ORF3      570     0.0020  0.0079  0.0027 (0.0019)   0.0011 (0.0007) NS    0.86 (1.50)    0.22 (0.025)***
3′ end    170     0.0017  0.0072                                                         0.20 (0.034)***
Total     2846    0.0016  0.0066  0.0026 (0.0010)   0.0012 (0.0003) NS    0.98 (0.084)   0.15 (0.010)***

The values of dN and dS are averages of all pairwise comparisons. Differences between the number of synonymous and replacement substitutions for each exon, for all exons grouped together, and between noncoding regions and overall synonymous substitutions were tested for significance using a t-test. Significance levels are: P ≥ 0.05, NS; 0.05 > P ≥ 0.01, *; 0.01 > P ≥ 0.001, **; P < 0.001, ***.
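As a check on the θ column of Table 4, Watterson's estimator can be computed directly from the number of polymorphic sites. The sketch below is a minimal illustration, not the DnaSP implementation: it uses the 78 polymorphic sites and 35 sequences reported in the text and the total length of 2846 nt from Table 4, and it assumes that all sites are counted (DnaSP may treat alignment gaps differently). The function name is hypothetical.

```python
def watterson_theta_per_site(segregating_sites, n_sequences, n_sites):
    """Watterson's theta per site: S / (a_n * L),
    where a_n = sum of 1/i for i = 1 .. n-1."""
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)

# Values taken from the text and from the "Total" row of Table 4.
theta_total = watterson_theta_per_site(
    segregating_sites=78, n_sequences=35, n_sites=2846
)
print(f"theta per site = {theta_total:.5f}")
```

The result is close to the 0.0066 listed for the whole element. Because θ depends only on the count of polymorphic sites while π weights each site by its frequency, the excess of singletons in this sample is what makes π (0.0016) so much smaller than θ.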

3′ noncoding regions, but the difference is not significant.

The two Brazilian elements from D. sturtevanti, from Ceará and Minas Gerais (I27), are very similar to each other, as reflected in the low level of polymorphism in clade D (Table 8) and the multiple indels that they share (Figure 1 and Table 5). The degree of polymorphism in nonsynonymous sites does not differ from that in synonymous sites. When compared to the D. willistoni elements, dN is always larger than dS (significantly so within ORF1, ORF2, and for the element as a whole), and the first and third introns are the slowest evolving structural regions of the element. Nonsynonymous sites in ORF3 evolve faster than those in the other ORFs.

Finally, one element from the Dominican Republic (DomRep4) and the element from Matlapa each form individual lineages (C and E, respectively). Details of their divergence from the D. willistoni elements are presented in Table 9. While dN is not significantly lower than dS for most exons in the case of the DomRep4 element, when the coding region is considered as a whole this difference is significant. In the case of the element from Matlapa, substitutions have accumulated significantly faster in nonsynonymous than in synonymous sites in all but ORF2, in which a long deletion prevents a reliable estimate. Once more the evolution rate in the 5′ and 3′ noncoding regions and in the first and third introns is lower than that of synonymous sites, and ORF3 evolved faster than the other ORFs for which a reliable rate could be obtained.

TABLE 5
Insertions and deletions in D. sturtevanti P elements, relative to the canonical element from D. melanogaster

P-element subregions, in column order (length in nt): 5′ end (122), ORF0 (291), Intron 1 (58), ORF1 (666), Intron 2 (53), ORF2 (726), Intron 3 (190), ORF3 (570), 3′ end (170) (a)

Strains
Jamaica (b)      1d 40- d -49,2d;1i
DominRep1 (b)    1d 01i
DominRep4        2 d  4, 32 d  1 d; 1 i  1 d  4 i  7 d  59 d  14 d; 1, 8 i  1 i
Matlapa2         3, 1 i 12, 12, 4 d; 1 i 1, 1, 50 d 38- d -697, 1 d 0 14, 4, 6, 1, 1 d; 1, 1, 6 i 3 i 606, 8, 4 i
A10S4 (c)        0  4 i  1 i  9 d  4 i  0  0  24 i  2 d; 1 i
A10S10 (c)       0  4 i  2 i  9 d  4 i  0  0  24 i  2 d; 1 i
A16S (c)         0  4 i  1 i  9 d  4 i  0  0  24 i  2 d; 1 i
El Salvador (c)  0  4 i  0  9 d  4 i  0  0  24 i  2 d; 1 i
Panama12 (c)     0  4 i  1 i  9 d  4 i  0  0  24, 1 i  2 d; 1 i
Ceará (d)        10,1d;1i 9-d;1i -14, 2 d; 6, 26, 1 d; 3 i 49, 9, 41 d 0 20, 22, 154, 40 d; 3, 21 d; 1, 1 i 1,1i 1,19i 1,3i
I27 (d)          10,1d;1i 9-d; -14, 2 d; 6, 26, 1 d; 3 i 49, 9, 41 d 0 20, 22, 154, 40 d; 3, 21 d; 1, 1 i 1,1i 1,19i 1,3i

In the body of the table, numbers are the length, in nucleotides, of each event, either insertion (i) or deletion (d). Commas separate multiple events in the same region. Semicolons separate insertions from deletions. Hyphens indicate that an event continues to the next (or from the previous) region. Absent regions are blank.
a There is a fixed insertion in all D. sturtevanti P elements relative to the canonical element in the 3′ untranscribed region.
b Large internal deletion starts at position 7 in ORF0 and ends in position 168 of ORF3.
c These elements share all indels with the exception of those in intron 1.
d These elements share all indels with the exception of one insertion exclusive to Ceará, in ORF0.

Interspecies P-element comparisons: Our survey revealed a striking difference between the two species in the degree of structural and sequence variability of the P elements sampled. While those sampled from D. willistoni are extremely similar to one another, with overall π ~0.16% (Table 4), the elements from D. sturtevanti differ considerably from each other in both structure and sequence (Figure 1 and Table 5). In addition, similarity among D. sturtevanti P elements is related to the geographic regions from which the strains were collected, as evidenced by the groups formed by elements from Central America (represented by Mexico, El Salvador, and Panama), from the Antilles (Jamaica and the Dominican Republic), and from Brazil (Ceará and Minas Gerais). Each of these three groups is characterized by specific insertions and deletions (Figure 1, Table 5), as well as by shared derived nucleotide polymorphisms, which are reflected in the P-element phylogeny (Figure 3). These data, together with homogeneous PCR band sizes within each group, are consistent with the hypothesis that different D. sturtevanti populations carry their own sets of P elements.

Comparisons of P elements between species provide strong evidence that sites in all structural regions of the element, except for intron 2, have evolved under purifying selection. These include not only replacement sites in exons but also, somewhat surprisingly, those in introns 1 and 3 and both the flanking regions upstream and downstream from the transposase gene. However, the distribution of polymorphisms within species provides no clear evidence of selective constraints in the short-term evolution of P elements.

Figure 3.—Phylogenetic relationships of P elements sampled from D. willistoni and D. sturtevanti, reconstructed using maximum parsimony. Elements collected for this study are in boldface type; they are prefixed with “Dwill” or “Dst” for D. willistoni and D. sturtevanti, respectively. All 35 D. willistoni P elements sampled form a monophyletic clade (only two representatives of these are shown in the shaded polygon comprising Dwill sequences); likewise, six D. sturtevanti P elements from Mexico and Central America are monophyletic (two representatives are shown in the shaded polygon comprising Dst sequences). The S. pallida P element and the noncanonical elements from Drosophila lusaltans, Drosophila pavlovskiana, and D. sturtevanti (Dsturtevanti42) were used as outgroups. Encircled letters mark individual clades of D. sturtevanti sequences and are used for reference in the text and tables. One of 2210 most parsimonious trees is shown, with branch lengths proportional to the number of nucleotide changes and bootstrap support shown on the tree. All elements analyzed in this study are part of the canonical clade, with the exception of Matlapa2 from D. sturtevanti.

DISCUSSION

We studied a group of 45 canonical and one noncanonical P element sampled from strains of D. willistoni and D. sturtevanti collected from across the geographic distributions of these species (Table 1, Figure 2). The striking differences observed between the two species in the degree of P-element structural and sequence variability, as well as differences in the selection patterns acting on their structural regions, are further discussed below in the context of evolutionary scenarios and mechanisms that could have facilitated them.

The residence time of canonical P elements in D. willistoni and D. sturtevanti: We have determined that canonical elements sampled previously from the willistoni and saltans groups diversified at most three million years ago (Silva and Kidwell 2000). However, the very high degree of sequence similarity among D. willistoni P elements, which contrasts sharply with the results from D. sturtevanti, suggests that the sampled D. willistoni canonical elements last shared a common ancestor much more recently than the time of diversification of the two species. If our sample is representative of all canonical P elements in D. willistoni, then this

TABLE 6
Polymorphism and divergence of D. sturtevanti P elements: clade A

                  Among D. sturtevanti P elements: clade A (n = 5)   vs. D. willistoni P (n = 4)^a
Region    Length  π       θ       π (silent)  π (replacement)  dS (SD) / dN (SD)
5′ end    120     0.0033  0.0040  -           -                0.094 (0.024) NS
ORF0      291     0.0042  0.0051  0.0134      0.0018           0.206 (0.064) / 0.013 (0.008)**
Intron 1  58      0.0069  0.0083  -           -                0.021 (0.006)*
ORF1      666     0.0024  0.0029  0           0.0032           0.096 (0.027) / 0.030 (0.008)*
Intron 2  57      0       0       -           -                0.084 (0.022) NS
ORF2      726     0.0022  0.0026  0.0025      0.0021           0.101 (0.026) / 0.032 (0.007)*
Intron 3  190     0.0021  0.0025  -           -                0.056 (0.014) NS
ORF3      570     0.0028  0.0034  0           0.0035           0.183 (0.044) / 0.060 (0.012)**
3′ end    171     0.0095  0.0114  -           -                0.070 (0.018) NS
Total     2833    0.0037  0.0045  -           -                0.130 (0.017) / 0.036 (0.005)***

^a Same as footnote in Table 4.
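Table 6 reports both π and θ for clade A. As a minimal sketch of how these two summary statistics relate, the Python below computes nucleotide diversity (π, the average number of pairwise differences per site; Nei and Li 1979) and Watterson's θ (Watterson 1975) for a small alignment. The toy sequences, the function names, and the assumption of aligned, gap-free sequences are illustrative only; they are not the data or the software used in this study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    Assumes all sequences are aligned, equal in length, and gap-free.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L), a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    # A site is segregating if more than one nucleotide is observed in its column.
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / (a_n * length)


# Hypothetical alignment: five sequences carrying three singleton mutations.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "TCGTACGTAC"]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
print(f"pi = {pi:.4f}, theta = {theta:.4f}")
```

In this toy alignment every polymorphism is a singleton, so θ (0.144) exceeds π (0.12). The same direction of difference, θ larger than π, appears in every polymorphic region of Table 6.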

species might have been one of the last species within the two New World Sophophora groups to be invaded by canonical P elements. The P elements sampled from D. sturtevanti are paraphyletic in relation to the other canonical elements, including those in D. willistoni (Figure 3). Together with the high sequence and structural heterogeneity of D. sturtevanti elements, this suggests an older invasion of D. sturtevanti by canonical P elements, relative to that observed for D. willistoni, and their subsequent divergence by mutation and drift. However, the high similarity among D. sturtevanti elements of clade A, which are present in lines from Mexico and Central America, and the much lower degree of polymorphism among these elements than in host markers (Silva 2000), provide strong evidence for the activity of D. sturtevanti elements that postdates the origin of this species. Whether a more exhaustive sampling of other subpopulations of D. sturtevanti would show additional localized spread of canonical elements within this species remains to be seen. If this were the case, it would suggest that the P-element family retains many pockets of transposition activity, which may serve as the source of new waves of horizontal transfer.

Spread of P elements within species: In sharp contrast to the polymorphism observed among D. sturtevanti P elements, that of D. willistoni elements is characterized by the almost exclusive presence of singleton mutations, which include both length and nucleotide polymorphisms. Because our sequences were obtained from cloned P elements, some of the polymorphisms could represent polymerase errors. The expected number of false polymorphisms in our total data set caused by such errors is approximately two (see Materials and Methods). However, we found over 70 polymorphisms, clearly above the expected background error level, so the effect (if any) of polymerase errors on the results should be negligible. Furthermore, the fact that the excess of singletons is observed in indels, as well as in point mutations, suggests that the pattern is real. The predominance of singleton substitutions may indicate either a rapid expansion of P elements in the host species or the presence of weakly deleterious mutations. The extremely high degree of similarity among P elements sampled from locales that are separated by thousands of kilometers, together with a lack of evidence for strong selective constraints within species, suggests that canonical P elements invaded D. willistoni relatively recently and spread quickly throughout the whole species.

D. willistoni populations, with an estimated Fst ~0.15 for both nuclear (Adh) and mitochondrial (ND5) mark-

TABLE 7
Polymorphism and divergence of D. sturtevanti P elements: clade B

                  Among D. sturtevanti P elements: clade B (n = 2)   vs. D. willistoni P (n = 4)^a
Region    Length  π       π (silent)  π (replacement)  dS (SD) / dN (SD)
5′ end    119     0.0334  -           -                0.058 (0.024) NS
ORF0      6       0       0           0                0 / 0
ORF3      262     0.0229  0           0.0246           0.107 (0.047) / 0.036 (0.012) NS
3′ end    171     0.0336  -           -                0.061 (0.023) NS
Total     506     0.0277  -           -                -

^a Same as footnote in Table 4.

TABLE 8
Polymorphism and divergence of D. sturtevanti P elements: clade D

                  Among D. sturtevanti P elements: clade D (n = 2)   vs. D. willistoni P (n = 4)^a
Region    Length  π       π (silent)  π (replacement)  dS (SD) / dN (SD)
5′ end    109     0       -           -                0.080 (0.029) NS
ORF0      282     0.0035  0           0.0045           0.072 (0.036) / 0.044 (0.014) NS
Intron 1  42      0       -           -                0*
ORF1      633     0.0034  0.0078      0.0022           0.123 (0.031) / 0.046 (0.033)*
Intron 2  53      0       -           -                0.082 (0.030) NS
ORF2      624     0.0016  0           0.0020           0.227 (0.046) / 0.030 (0.008)***
Intron 3  190     0.0053  -           -                0.051 (0.019)*
ORF3      337     0.0030  0           0.0037           0.172 (0.057) / 0.084 (0.018) NS
3′ end    148     0       -           -                0.133 (0.048) NS
Total     2420    0.0025  -           -                0.156 (0.021) / 0.047 (0.006)***

^a Same as footnote in Table 4.
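The significance flags in Tables 6 to 8 mark regions where dN differs from dS. One common way to make such a comparison from two estimates and their standard deviations is a two-sided Z-test, and the sketch below assumes that approach. It treats each SD as a standard error and the two estimates as independent; these assumptions, the function name `z_test`, and the choice of test are illustrative, since the procedure actually used is not restated in this section. The input values are the ORF0, ORF2, and Total rows of Table 8.

```python
import math


def z_test(d_s, sd_s, d_n, sd_n):
    """Two-sided Z-test for d_S versus d_N, treating the SDs as standard errors."""
    z = (d_s - d_n) / math.sqrt(sd_s ** 2 + sd_n ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# (d_S, SD, d_N, SD) copied from Table 8 (clade D vs. D. willistoni).
rows = {
    "ORF0": (0.072, 0.036, 0.044, 0.014),
    "ORF2": (0.227, 0.046, 0.030, 0.008),
    "Total": (0.156, 0.021, 0.047, 0.006),
}
results = {region: z_test(*values) for region, values in rows.items()}
for region, (z, p) in results.items():
    print(f"{region}: z = {z:.2f}, p = {p:.2g}")
```

Under these assumptions ORF0 does not reach the 5% level, whereas ORF2 and the total coding region are significant well below the 0.1% level, in line with the NS and *** flags in the table.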

ers, are much less structured than those of D. sturtevanti, for which Fst ≥ 0.48 for both types of molecular markers (Silva 2000). The marginal populations of D. willistoni, such as those in the Antilles and Florida, show a significant degree of genetic differentiation from the continental populations, while in D. sturtevanti significant differentiation is observed among continental populations as well (Silva 2000). This species difference likely results from the higher density, wider ecological range, and concomitant larger effective population size of D. willistoni relative to that of D. sturtevanti.

Our results suggest that the spread of P elements can be hindered by significant subdivision of the host population, as shown by the association between element similarity and geographic origin of the D. sturtevanti elements. However, the canonical P elements in D. willistoni, including those from peripheral, isolated populations, are all nearly identical, showing that these elements can spread rapidly throughout the species range even when gene flow, as detected from host markers, is somewhat restricted.

Selection on P elements: In both D. willistoni and D. sturtevanti, the presence of deletions and insertions that disrupt the transposase reading frame, as well as the fact that these indels are evenly distributed among coding and noncoding regions, suggests a lack of selective constraints on P elements within species. Between species, however, the slower rate of evolution of nonsynonymous sites in all four ORFs relative to that of synonymous sites suggests that there is selective pressure to maintain transposase activity, in agreement with previous results (Witherspoon 1999). This suggests that elements that encode an active transposase have a fitness advantage over nonautonomous elements. Interestingly, in all comparisons, nonsynonymous sites in ORF3 evolve significantly faster than those in the other three ORFs. As ORF3 is the only ORF not included in the 66-kD polypeptide that represses transposition, this result

TABLE 9
Divergence between D. sturtevanti and D. willistoni P elements: lineages C and E

          st-DomRep4 vs. D. willistoni P^a          st-Matlapa2 vs. D. willistoni P (n = 4)^a
Region    Length  dS (SD) / dN (SD)                 Length  dS (SD) / dN (SD)
5′ end    118     0.047 (0.019) NS                  120     0.111 (0.045) NS
ORF0      255     0.060 (0.035) / 0.026 (0.012) NS  252     0.225 (0.073) / 0.071 (0.020)*
Intron 1  58      0.054 (0.021) NS                  58      0.072 (0.039)*
ORF1      665     0.051 (0.019) / 0.020 (0.006) NS  603     0.161 (0.376) / 0.073 (0.013)*
Intron 2  53      0.105 (0.042) NS                  53      0.177 (0.071) NS
ORF2      716     0.124 (0.030) / 0.022 (0.006)***  24      0.087 (0.126) / 0.496 (0.219) NS
Intron 3  131     0.098 (0.039) NS                  131     0.107 (0.043) NS
ORF3      555     0.107 (0.033) / 0.066 (0.012) NS  542     0.247 (0.055) / 0.102 (0.016)*
3′ end    170     0.036 (0.014) NS                  170     0.094 (0.041)*
Total     2191    0.089 (0.014) / 0.033 (0.004)***  1422    0.199 (0.028) / 0.089 (0.009)***

^a Same as footnote in Table 4.

provides support for the hypothesis that independent, additive constraints act on the transposase and on repressors of transposition (Witherspoon 1999). While ORFs 0–2 are subjected to selective pressures imposed by both types of constraint, ORF3 is subjected only to those imposed on the transposase; therefore it evolves faster than the first three ORFs.

Surprisingly, the rate of evolution of some noncoding regions, namely the first and third introns and, often, the 3′ and 5′ flanking regions, is of the same order of magnitude as that of nonsynonymous sites and considerably lower than that observed in synonymous sites. Several characteristics of the P-element transposition mechanism, as well as the maternal transmission of the 66-kD repressor of transposition, are linked to these regions and provide possible explanations for their relatively high degree of conservation. The 3′ and 5′ noncoding regions contain motifs that are required for transposition, such as the inverted terminal repeats, 11-bp internal inverted repeats, and unique sequences ~150 bp in length at each end of the element (Engels 1989). Recent evidence suggests that the third intron is involved in targeting of the P-element mRNA to the oocyte (Simmons et al. 2002). More importantly, this intron is not spliced out from the P-element mRNA transcript in the somatic tissue, where it is translated; its presence is essential to maintain in the correct frame the early stop codon that gives rise to the truncated 66-kD repressor polypeptide (Robertson and Engels 1989; Misra and Rio 1990; Roche et al. 1995), thus preventing transposase activity in the soma, which would be clearly deleterious to the host. Therefore, it is not too surprising that this intron evolves under selection. Because the rate of evolution of intron 1 is just as low as that of intron 3, our results strongly suggest that the former contains a motif, or motifs, involved in the repression of transposition.

Selection mechanism: The patterns of divergence described above support a scenario according to which selection acts on the element only at the time of horizontal transfer between species, and not during evolution within species. When invading a new genome, the fitness of autonomous elements is higher than that of nonautonomous elements because the former contain, by definition, all intact sequence motifs required for transposition and can therefore spread in the new host; during within-species transmission, however, transposase could act in trans, benefiting autonomous and nonautonomous elements alike. Previously, this has been the hypothesis most frequently put forward to explain observations of selective constraint in the evolution of transposases encoded by class II TEs (Witherspoon 1999; Silva and Kidwell 2000).

Our present results from a large sample of P elements from D. willistoni show that dS within species is larger than dN for all four ORFs, even though the difference is not statistically significant (Table 4). The maximum-likelihood value of ω is also <1, but again not significantly so. These results suggest the possibility that selection also acts during the transmission of elements within species, but that the small number of polymorphisms observed makes it impossible to obtain enough power to produce a significant result. Selection at the within-species level could occur by two processes. It could result from the fact that some of the nonsynonymous polymorphisms are effectively neutral or only slightly deleterious (Kimura 1983). Slightly deleterious nonsynonymous polymorphisms linger in populations for many generations, but only rarely go to fixation and hence the ratio of synonymous to nonsynonymous mutations increases with time, being barely notable in a sample of recently diverged sequences. A similar pattern is observed in the mitochondrial DNA of many species, in which the ratio of synonymous to nonsynonymous mutations within species is lower than the ratio between species (Nachman 1998). This first process requires that transposase acts preferentially in cis, such that the mutated elements have a lower transposition efficiency; this seems unlikely in eukaryotes, as the processes of translation and transposition occur in different cellular compartments. A second process that could lead to a difference in the rate of accumulation of synonymous and replacement substitutions would occur when the selection coefficient of mutations in an element varies with the ratio between autonomous and nonautonomous elements in a cell. This corresponds to selection mechanism (4) described by Witherspoon (1999) and can be explained as follows: during the first stages after invasion of a new host species, most elements are complete and transposase will be plentiful, acting either in cis or in trans. However, with time, elements accumulate mutations and autonomous elements become sparse. If, at this point, the number of autonomous and of nonautonomous elements per cell varies independently, and if the frequency of transposition is a direct function of the number of autonomous elements per cell, such that transposition is more frequent in cells with the most functional elements, then the conditions are created for selection to favor autonomous over nonautonomous elements. A larger number of sequences (or more divergent elements) will have to be collected to determine whether the lack of significance in the values of dS and dN within D. willistoni is real or the result of a lack of power due to the small number of mutations.

In conclusion, we have provided evidence consistent with the idea that P elements are opportunistic elements that can spread very rapidly throughout a species, despite the existence of a small amount of population structure. We have also shown that selective pressures differ significantly among structural regions of the element. Most remarkably, evolutionary constraints in the first and third introns are almost as strong as those observed in nonsynonymous sites. Because the strongest constraints seem to be associated with functions important for both transposition and its repression, both in the third intron and in exons, we postulate that the function(s) related to putative motifs present in the first intron of the element fall into this category. Finally, our results show a pattern that could be caused by selection acting not only at the time of horizontal transfer, but also during the spread of P elements within species, although a more exhaustive sample or more divergent elements are needed to obtain a conclusive result.

We thank Salvador Baez, Glória Lagunes, and Jaime Piñero for logistic assistance in the field and Cláudia Carareto, Vera Valente, Elgion Loreto, and Fabiana Herédia for providing the Brazilian strains of Drosophila. We also thank Jonathan Clark, Andrew Holyoake, Patrick O’Grady, Michael Nachman, Michael Simmons, and one anonymous reviewer for numerous suggestions that significantly improved the focus of this manuscript. This research was funded through fellowship support from Junta Nacional de Investigação Científica e Tecnológica, from the Interdisciplinary Program in Genetics at the University of Arizona, and from the Flinn Foundation to J.C.S. This work was also supported by National Science Foundation grants DEB-9701252 and DEB-9815754.

LITERATURE CITED

Clark, J. B., and M. G. Kidwell, 1997 A phylogenetic perspective on P transposable element evolution in Drosophila. Proc. Natl. Acad. Sci. USA 94: 11428–11433.

Clark, J. B., T. K. Altheide, M. J. Schlosser and M. G. Kidwell, 1995 Molecular evolution of P transposable elements in the genus Drosophila. I. The saltans and willistoni species groups. Mol. Biol. Evol. 12: 902–913.

Clark, J. B., J. C. Silva and M. G. Kidwell, 2002 Evidence of horizontal transfer of P transposable elements, pp. 161–171 in Horizontal Gene Transfer, edited by M. Syvanen and C. I. Kado. Academic Press, San Diego.

Daniels, S. B., K. R. Peterson, L. D. Strausbaugh, M. G. Kidwell and A. Chovnick, 1990 Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124: 339–355.

Engels, W., 1989 P elements in Drosophila melanogaster, pp. 437–483 in Mobile DNA, edited by D. E. Berg and M. M. Howe. American Society of Microbiology, Washington, DC.

Engels, W. R., 1992 The origin of P elements in Drosophila melanogaster. BioEssays 14: 681–686.

Engels, W. R., 1996 P elements in Drosophila, pp. 103–123 in Transposable Elements, edited by H. Saedler and A. Gierl. Springer-Verlag, Berlin.

Gloor, G. B., C. R. Preston, D. M. Johnson-Schlitz, N. A. Nassif, R. W. Phillis et al., 1993 Type I repressors of P element mobility. Genetics 135: 81–95.

Goldman, N., and Z. Yang, 1994 A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11: 725–736.

Hagemann, S., W. J. Miller and W. Pinsker, 1994 Two distinct P element subfamilies in the genome of Drosophila bifasciata. Mol. Gen. Genet. 244: 168–175.

Hagemann, S., E. Haring and W. Pinsker, 1996a A new P element subfamily from Drosophila tristis, D. ambigua, and D. obscura. Genome 39: 978–985.

Hagemann, S., E. Haring and W. Pinsker, 1996b Repeated horizontal transfer of P transposons between Scaptomyza pallida and Drosophila bifasciata. Genetica 98: 43–51.

Hartl, D. L., A. R. Lohe and E. R. Lozovskaya, 1997 Modern thoughts on an ancyent marinere: function, evolution, regulation. Annu. Rev. Genet. 31: 337–358.

Hey, J., and J. Wakeley, 1997 A coalescence estimator of the population recombination rate. Genetics 145: 833–846.

Jukes, T. H., and C. R. Cantor, 1969 Evolution of protein molecules, pp. 21–132 in Mammalian Protein Metabolism, edited by H. N. Munro. Academic Press, New York.

Kidwell, M. G., 1979 Hybrid dysgenesis in Drosophila melanogaster. The relationship between the P-M and I-R interaction systems. Genet. Res. 33.

Kimura, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK/London/New York.

Lansman, R. A., R. O. Shade, T. A. Grigliatti and H. W. Brock, 1987 Evolution of P transposable elements: sequences of Drosophila nebulosa P elements. Proc. Natl. Acad. Sci. USA 84: 6491–6495.

Laski, F. A., D. C. Rio and G. M. Rubin, 1986 Tissue specificity of Drosophila P element transposition is regulated at the level of mRNA splicing. Cell 44: 7–19.

Loreto, E. L., V. L. Valente, A. Zaha, J. C. Silva and M. G. Kidwell, 2001 Drosophila mediopunctata P elements: a new example of horizontal transfer. J. Hered. 92: 375–381.

Maddison, D. R., and W. P. Maddison, 2001 MacClade 4: Analysis of Phylogeny and Character Evolution. Sinauer Associates, Sunderland, MA.

Misra, S., and D. C. Rio, 1990 Cytotype control of Drosophila P element transposition: the 66 kd protein is a repressor of transposase activity. Cell 62: 269–284.

Nachman, M. W., 1998 Deleterious mutations in animal mitochondrial DNA. Genetica 102–103: 61–69.

Nei, M., and T. Gojobori, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3: 418–426.

Nei, M., and L. Jin, 1989 Variances of the average numbers of nucleotide substitutions within and between populations. Mol. Biol. Evol. 6: 290–300.

Nei, M., and W. H. Li, 1979 Mathematical models for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76: 5269–5273.

O’Hare, K., and G. M. Rubin, 1983 Structures of P transposable elements and their sites of insertion and excision in the Drosophila melanogaster genome. Cell 34: 25–35.

Oliveira de Carvalho, M., J. C. Silva and E. L. S. Loreto, 2004 Analyses of P-like transposable element sequences from the genome of Anopheles gambiae. Insect Mol. Biol. 13: 55–63.

Pinsker, W., E. Haring, S. Hagemann and W. J. Miller, 2001 The evolutionary life history of P transposons: from horizontal invaders to domesticated neogenes. Chromosoma 110: 148–158.

Rio, D. C., 2002 P transposable elements in Drosophila melanogaster, pp. 484–518 in Mobile DNA II, edited by N. L. Craig, R. Craigie, M. Gellert and A. M. Lambowitz. American Society for Microbiology, Washington, DC.

Robertson, H. M., and W. R. Engels, 1989 Modified P elements that mimic the P cytotype in Drosophila melanogaster. Genetics 123: 815–824.

Roche, S. E., M. Schiff and D. C. Rio, 1995 P-element repressor autoregulation involves germ-line transcriptional repression and reduction of third intron splicing. Genes Dev. 9: 1278–1288.

Rozas, J., and R. Rozas, 1999 DnaSP 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15: 174–175.

Sarkar, A., R. Sengupta, J. Krzywinski, X. Wang, C. Roth et al., 2003 P elements are found in the genomes of nematoceran insects of the genus Anopheles. Insect Biochem. Mol. Biol. 33: 381–387.

Silva, J. C., 2000 Population genetics of P transposable elements and their host species, with emphasis on Drosophila willistoni and Drosophila sturtevanti. Ph.D. Thesis, University of Arizona, Tucson.

Silva, J. C., and M. G. Kidwell, 2000 Horizontal transfer and selection in the evolution of P elements. Mol. Biol. Evol. 17: 1542–1557.

Silva, J. C., E. L. Loreto and J. B. Clark, 2004 Factors that affect the horizontal transfer of transposable elements. Curr. Issues Mol. Biol. 6: 57–72.

Simmons, M. J., K. J. Haley and S. J. Thompson, 2002 Maternal transmission of P element transposase activity in Drosophila melanogaster depends on the last P intron. Proc. Natl. Acad. Sci. USA 99: 9306–9309.

Simonelig, M., and D. Anxolabéhère, 1991 A P element of Scaptomyza pallida is active in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 88: 6102–6106.

Swofford, D. L., 1999 PAUP*: Phylogenetic Analysis Using Parsimony and Other Methods. Sinauer Associates, Sunderland, MA.

Watterson, G. A., 1975 On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276.

Witherspoon, D. J., 1999 Selective constraints on P-element evolution. Mol. Biol. Evol. 16: 472–478.

Yang, Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13: 555–556.

Yang, Z., and R. Nielsen, 2002 Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 19: 908–917.

Communicating editor: M. Simmons