Copyright © 2004 by the Genetics Society of America. DOI: 10.1534/genetics.103.025916

Duplicative and Conservative Transpositions of Larval serum 1 in the Genus Drosophila

Josefa González, Ferran Casals and Alfredo Ruiz¹
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain
Manuscript received December 17, 2003
Accepted for publication June 1, 2004

ABSTRACT Interspecific comparative molecular analyses of transposed genes and their flanking regions can help to elucidate the time, direction, and mechanism of gene transposition. In the Drosophila melanogaster genome, three Larval serum protein 1 (Lsp1) genes (α, β, and γ) are present and each of them is located on a different chromosome, suggesting multiple transposition events. We have characterized the molecular organization of Lsp1 genes in D. buzzatii, a species of the Drosophila subgenus, and in D. pseudoobscura, a species of the Sophophora subgenus. Our results show that only two Lsp1 genes (β and γ) exist in these two species. The same chromosomal localization and genomic organization, different from that of D. melanogaster, is found in both species for the Lsp1β and Lsp1γ genes. Overall, at least two duplicative and two conservative transpositions are necessary to explain the present chromosomal distribution of Lsp1 genes in the three Drosophila species. Clear evidence for the implication of snRNA genes in the transposition of Lsp1β in Drosophila has been found. We suggest that an ectopic exchange between highly similar snRNA sequences was responsible for the transposition of this gene. We have also identified the putative cis-acting regulatory regions of these genes, which seemingly transposed along with the coding sequences.

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AY561258 and AY561259.

¹Corresponding author: Departament de Genètica i de Microbiologia, Facultat de Ciències-Edifici C, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain. E-mail: [email protected]

Sequence analysis of genomes has revealed that gene transposition contributes significantly to the reorganization of eukaryotic genomes. Gene transposition refers to the movement of relatively small genomic segments, containing one or a few genes, from one chromosomal position to another. This movement may or may not be accompanied by the duplication of the genomic segment, two processes that may be denoted as duplicative and conservative transposition, respectively. In nematodes, gene transposition seems to be the most frequent kind of genome rearrangement (Coghlan and Wolfe 2002), whereas duplication of chromosomal segments encompassing a few genes followed by differential gene loss is a common cause of gene order changes in yeasts (Llorente et al. 2000; Fischer et al. 2001). In plants, repeated rounds of large-scale genome duplication followed by selective gene loss are the main factors in genome evolution. Chromosomal rearrangements were thought to be only a minor factor in the divergence of plant genomes (Ku et al. 2000). However, when more detailed comparisons were performed, many chromosomal rearrangements were found. For example, the comparison of genome sequences of rice to orthologous regions from other grass species revealed that numerous local rearrangements, including transpositions of single genes to different chromosomes, have occurred (Bennetzen and Ma 2003).

Transposition has also played a significant role in the evolution of the mammalian genome. Segmental duplications originated from the duplicative transposition of small portions of chromosomal material represent ~5% of the human genome (Eichler 2001; Lander et al. 2001; Bailey et al. 2002) and at least 1.2% of the mouse genome (Cheung et al. 2003). Some of the duplicated segments in the human genome are associated with rapid gene innovation and chromosomal rearrangement in the genomes of man and the great apes (Samonte and Eichler 2001; Armengol et al. 2003; Locke et al. 2003).

In Drosophila, segmental duplications seem to be rare in comparison to the number found in mammalian genomes (Lander et al. 2001; Celniker et al. 2002). Also, detailed analyses by in situ hybridization show that the gene content of chromosomal elements is generally conserved and suggest that gene transpositions are relatively scarce (González et al. 2002; Ranz et al. 2003) in relation to paracentric inversions, which have been traditionally considered as the chief type of chromosomal change (Krimbas and Powell 1992; Powell 1997). However, recent sequence analyses comparing five different Drosophila species point to similar numbers of inversions and gene transpositions (Bergman et al. 2002) and a number of new genes originated by retroposition have been unveiled (Betrán et al. 2002).

Diverse molecular mechanisms (sometimes poorly understood) may be responsible for gene transposition

Genetics 168: 253–264 (September 2004) 254 J. González, F. Casals and A. Ruiz

events. A common mechanism for gene transposition is retroposition, which implies reverse transcription of RNA and insertion of the resulting cDNA into a different genome site. In humans, the long interspersed element (LINE) L1 often associates 3′ flanking DNA as a read-through transcript and carries the non-L1 sequence to a new genomic location, a process termed L1-mediated transduction (Moran et al. 1999; Lander et al. 2001). Seemingly the LINE machinery can also act in trans on cellular RNA substrates, giving rise to the trans-mobilization of genomic DNA, processed pseudogenes, and occasionally new functional genes (Esnault et al. 2000; Betrán et al. 2002; Ejima and Yang 2003; Long et al. 2003). Another mechanism of transposition is transposon-mediated excision and insertion of genomic segments. For instance, in D. melanogaster, Foldback elements flanking relatively large genomic segments are able to transport these segments to sites far away in the genome, forming the so-called “giant transposons” (Chia et al. 1985; Lovering et al. 1991). Excision and insertion of these giant transposons is mediated by homologous recombination involving the Foldback sequences at the transposon termini. Transposable elements also seem to be implicated in the origin of segmental duplications in humans. Duplication junctions have been found to be enriched for Alu short interspersed element sequences, with a significant proportion of all segmental duplications ending within Alu sequences (Bailey et al. 2003). This observation suggests Alu-Alu homologous recombination as the most likely mechanism for these rearrangements. A similar mechanism has previously been shown to generate small duplications, deletions, and inversions in diverse organisms.

Larval serum protein 1 (Lsp1) genes provide one of the few examples of gene transposition in the genus Drosophila. In D. melanogaster, each of the three Lsp1 genes is located on a different chromosome: Lsp1α in chromosome X, Lsp1β in chromosomal arm 2L, and Lsp1γ in chromosomal arm 3L (Roberts and Evans-Roberts 1979; Smith et al. 1981). These chromosomal arms correspond to Muller’s elements A, B, and D, respectively (Table 1). Brock and Roberts (1983) mapped Lsp1 genes by in situ hybridization in nine other species of the Sophophora subgenus (including D. pseudoobscura) and five species of the Drosophila subgenus (including D. hydei as a representative species of the repleta group). In the melanogaster subgroup species they were able to localize the three Lsp1 genes: Lsp1α on element A, Lsp1β on element B, and Lsp1γ on element D. In all the other species, both Lsp1α and Lsp1β hybridized to the same polytene band of Muller’s element E, suggesting a gene exchange between elements. No hybridization of the Lsp1γ gene was observed although γ-like proteins were detected with specific antibodies. To determine Lsp1 gene number, Brock and Roberts (1983) performed Southern analyses and concluded that at least two genes, one α-like and one β-like, were present in all the species analyzed. Their data also suggested that the ancestor of the genus Drosophila probably had its Lsp1 genes on element E. Recently, we localized by in situ hybridization Lsp1α in chromosome 2 (Muller’s element E) of D. repleta and D. buzzatii (González et al. 2002; Ranz et al. 2003), corroborating their results.

In this work, Lsp1 genes and their flanking sequences have been cloned and sequenced in D. buzzatii, a species belonging to the repleta group of the Drosophila subgenus, which diverged from D. melanogaster 40–62 MYA (Beverly and Wilson 1984; Russo et al. 1995). In addition, the genome sequence of D. pseudoobscura (available at http://www.hgsc.bcm.tmc.edu) has been searched and Lsp1 genes have been annotated in this species, which diverged from D. melanogaster ~30 MYA (Throckmorton 1975) and belongs to the same subgenus. The aims of this study are: (i) to determine beyond doubt the number and localization of Lsp1 genes in these two Drosophila species because Southern analyses and in situ hybridization results for members of gene families may be misleading (Bachtrog and Charlesworth 2003); (ii) to ascertain the number and type of transposition events undergone by Lsp1 genes during the evolution of the genus Drosophila; (iii) to uncover the molecular mechanism of transposition, in particular to test the hypothesis of an involvement of transposable elements; and (iv) to identify putative regulatory sequences of Lsp1 genes and determine whether these regulatory sequences transposed along with the coding sequences or were recruited ex novo at the new chromosomal location.

TABLE 1
Chromosomal homologies between D. melanogaster, D. pseudoobscura, and D. buzzatii, the three Drosophila species analyzed in this study (from Powell 1997)

                                                  Muller’s element
Species            Subgenus     Species group     A    B    C    D    E    F
D. melanogaster    Sophophora   melanogaster      X    2L   2R   3L   3R   4
D. pseudoobscura   Sophophora   obscura           XL   4    3    XR   2    5
D. buzzatii        Drosophila   repleta           X    3    5    4    2    6

Gene Transpositions in Drosophila 255
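The chromosomal homologies in Table 1 are used throughout the article to translate a chromosome or arm in one species into its counterpart in another. The following is a minimal sketch of that lookup, assuming the Table 1 assignments as transcribed here; the names `MULLER_ELEMENTS`, `LSP1_DMEL`, `element_of`, and `homologous_arm` are hypothetical and introduced only for illustration.

```python
# Muller element assignments transcribed from Table 1 (Powell 1997).
# The dictionary layout and all function names are illustrative assumptions.
MULLER_ELEMENTS = {
    "D. melanogaster": {"A": "X", "B": "2L", "C": "2R", "D": "3L", "E": "3R", "F": "4"},
    "D. pseudoobscura": {"A": "XL", "B": "4", "C": "3", "D": "XR", "E": "2", "F": "5"},
    "D. buzzatii": {"A": "X", "B": "3", "C": "5", "D": "4", "E": "2", "F": "6"},
}

# Lsp1 locations in D. melanogaster as stated in the Introduction.
LSP1_DMEL = {"Lsp1alpha": "X", "Lsp1beta": "2L", "Lsp1gamma": "3L"}


def element_of(species: str, arm: str) -> str:
    """Return the Muller element (A-F) that corresponds to a chromosome or arm."""
    for element, name in MULLER_ELEMENTS[species].items():
        if name == arm:
            return element
    raise KeyError(f"{arm!r} is not listed for {species}")


def homologous_arm(species: str, arm: str, other_species: str) -> str:
    """Translate a chromosome or arm of one species into its homolog in another."""
    return MULLER_ELEMENTS[other_species][element_of(species, arm)]


if __name__ == "__main__":
    for gene, arm in LSP1_DMEL.items():
        print(gene, "on", arm, "= Muller element", element_of("D. melanogaster", arm))
    # D. buzzatii chromosome 2 expressed as a D. melanogaster arm
    print(homologous_arm("D. buzzatii", "2", "D. melanogaster"))
```

Keying each species by Muller element keeps the cross-species translation to two dictionary lookups, which mirrors how the text moves between, for example, D. buzzatii chromosome 2 and D. melanogaster arm 3R.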

MATERIALS AND METHODS

Drosophila stocks: Two lines of D. buzzatii were used: line j-19 from Ticucho (Argentina) is fixed for chromosomal arrangement 2j and line jq7-4 from Otamendi (Argentina) is fixed for arrangement 2jq7. The genome sequence of D. pseudoobscura comes from inbred line MV2-25 (http://www.hgsc.bcm.tmc.edu).

Screening of genomic libraries: Two different lambda genomic libraries were screened by plaque hybridization: the j-19 library (Cáceres et al. 2001) and the jq7-4 library (Casals et al. 2003). Both libraries were previously amplified as described in Sambrook et al. (1989). DNA from positive phages was digested and the resulting fragments were subcloned into Bluescript II SK vector (Stratagene, La Jolla, CA) after gel purification. The j-19 library was screened with a 0.7-kb BamHI fragment of the D. melanogaster Lsp1α gene (Brock and Roberts 1983). Six positive phages were recovered and one of them, λj-19/8, was partially sequenced and found to contain the 5′ region of Lsp1β (Figure 1a). Another lambda phage, λj-19/25, containing the 3′ region of this gene had been previously isolated in our laboratory (Casals et al. 2003). None of the six positive phages contained Lsp1γ. To clone this gene, the j-19 library was screened with a 0.6-kb HindIII-SalI fragment of D. buzzatii Lsp1β (Figure 1a). Nine positive phages were identified but again none of them included Lsp1γ. To clone D. buzzatii Lsp1γ, a different library (jq7-4) was screened with the same fragment of D. melanogaster Lsp1α previously used to screen the j-19 library. Five positive phages were identified and one of them, λjq7-4/12, contained the 3′ end of the Lsp1γ gene. To clone the other end of the gene, a 0.75-kb SalI-ClaI fragment of D. buzzatii Lsp1γ was used to screen the same library (Figure 2a). Seventeen positive phages were recovered and one of them, λjq7-4/27, included almost the entire Lsp1γ gene.

Southern analysis: Southern hybridization was performed as described in Sambrook et al. (1989). Probes were labeled by random priming with digoxigenin-11-dUTP and hybridization was carried out overnight in standard buffer with 50% formamide at 42° for homologous probes and at 37° for heterologous probes. Stringency washes were performed with 2× SSC 0.1% SDS and 0.1× SSC 0.1% SDS solutions at 68° and 50° for homologous and heterologous hybridizations, respectively.

DNA sequencing and sequence analysis: Sequences were obtained with an ABI 373 A (Perkin Elmer, Norwalk, CT) automated DNA sequencer using M13 universal forward and reverse primers. A few internal primers were also designed when necessary. Nucleotide sequences were assembled using GeneToolLite software (BioTools). Similarity searches in the GenBank/EMBL and in the Drosophila pseudoobscura genome project (available at http://www.hgsc.bcm.tmc.edu) databases were performed using the blastn and fasta3 programs. Multiple sequence alignments were obtained with ClustalW (Thompson et al. 1994) and DiAlign (Morgenstern 1999). PAML software (Yang 1997) was used to estimate the number of synonymous (dS) and nonsynonymous (dN) substitutions per site. This software avoids using reconstructed ancestral sequences to estimate dS and dN for lineages in a phylogeny by using a maximum likelihood approach. Different codon-based likelihood models that allow for different dN/dS ratios among evolutionary lineages can be devised. The models can then be compared to test the neutral prediction that the dN/dS ratio is identical among lineages (Yang and Bielawski 2000).

In situ hybridization: In situ hybridization of DNA probes was carried out as described in Montgomery et al. (1987). Hybridization temperature was 37°. Probes were labeled with biotin-16-dUTP by nick translation and detection was done using the ABC-Elite kit from Vector Laboratories (Burlingame, CA). Hybridization signals were localized using the cytological maps of D. buzzatii (Ruiz and Wasserman 1993) and photographs were taken with a phase contrast Nikon Optiphot-2 microscope at ×600 magnification.

TABLE 2
Results of the blast search against the D. pseudoobscura genome database as of August 28, 2003, using as queries the coding sequences of Lsp1α, Lsp1β, and Lsp1γ

Query    Blast hits                  Score (bits)   E-value
Lsp1β    Contig 815_contig 5737      2044           0.0
         Contig 1500_contig 3546     222            3e−56
Lsp1γ    Contig 1500_contig 3546     1237           0.0
         Contig 815_contig 5737      121            9e−26
Lsp1α    Contig 815_contig 5737      1269           0.0
         Contig 1500_contig 3546     111            9e−23

RESULTS

Lsp1 gene number: Screening of two genomic libraries of D. buzzatii with different probes allowed us to isolate clones containing the Lsp1β and Lsp1γ genes (see Materials and Methods). To determine the number of Lsp1 genes present in the D. buzzatii genome, a Southern analysis was performed (see Figure S1 available as online supplementary material at http://www.genetics.org/supplemental/). Genomic DNA from D. buzzatii and D. melanogaster (as control) was digested with EcoRI and HindIII and hybridized with the same three probes used to screen the D. buzzatii libraries (see Materials and Methods). The D. melanogaster Lsp1α probe hybridized to 6- and 4.1-kb fragments of D. melanogaster genomic DNA corresponding to Lsp1α and Lsp1β, respectively (Brock and Roberts 1983). In D. buzzatii this probe hybridized to a single fragment of 3.1 kb corresponding to Lsp1β. The D. buzzatii Lsp1β probe hybridized to 6- and 4.1-kb fragments of D. melanogaster corresponding to Lsp1α and Lsp1β, respectively (Brock and Roberts 1983), and to a 3.1-kb fragment of D. buzzatii genomic DNA corresponding to Lsp1β. Finally, the D. buzzatii Lsp1γ probe hybridized to the 4.1-kb fragment corresponding to Lsp1β and to 1.8- and 1.4-kb fragments both corresponding to Lsp1γ of D. melanogaster (Brock and Roberts 1983) and also to a 3.1-kb fragment corresponding to Lsp1β and a 1-kb fragment corresponding to Lsp1γ of D. buzzatii. Overall, the results of Southern analyses indicated that only two genes, Lsp1β and Lsp1γ, are present in the D. buzzatii genome. No Lsp1α gene was detected by Southern analysis in the D. buzzatii genome.

Similarity searches against the D. pseudoobscura genome sequence database were carried out (Table 2). Two sequences with significant similarity were found

Figure 1. Genome organization of the Lsp1β region in D. buzzatii, D. pseudoobscura, and D. melanogaster. Arrows indicate the start point and direction of transcription. (a) Annotation and chromosomal localization of the 9464 bp sequenced in D. buzzatii. Restriction map includes EcoRI (E), HindIII (H), and SalI (S) sites. The λ-phages used to sequence this region are represented as lines: solid lines are sequenced regions and broken lines are regions cloned but not sequenced. The short thick segment represents the probe used to screen the D. buzzatii libraries. (b) Annotation of the homologous region in D. pseudoobscura. (c) Fragment of the AE003745 genomic clone of D. melanogaster where those genes flanking Lsp1β in D. buzzatii and D. pseudoobscura are located. The chromosomal localization of these genes is also given. Note that in this species Lsp1β is not present between CG12492 and Pellino. (d) Genomic organization and chromosomal localization of the Lsp1β region in D. melanogaster. This region is included in the genomic clone AE003588 of this species.

with Lsp1β. The most similar sequence, the one included in contig 815_contig 5737, was considered to correspond to the D. pseudoobscura Lsp1β ortholog. When the database was searched with the coding sequence of Lsp1γ as query the same two hits were recovered. This time the most similar sequence, included in contig 1500_contig 3546, was considered to correspond to the D. pseudoobscura ortholog of Lsp1γ. No new sequences were identified when the D. melanogaster Lsp1α sequence was used as query, the only significant hits being those previously identified as Lsp1β and Lsp1γ. We conclude that only Lsp1β and Lsp1γ are present in the D. pseudoobscura genome.

Organization of the Lsp1β and Lsp1γ genomic regions in D. buzzatii and D. pseudoobscura: In D. buzzatii, the Lsp1β gene was localized by in situ hybridization, using the λj-19/25 phage as probe, to polytene band D3c of chromosome 2 (see Figure S2 at http://www.genetics.org/supplemental/). This chromosome is homologous to chromosomal arm 3R of D. melanogaster (Muller’s element E; Table 1). Overall, 9464 bp including the entire D. buzzatii Lsp1β gene and its flanking regions were sequenced (accession no. AY561258). In this species, Lsp1β is flanked by CG12492 and Pellino (Figure 1a). Between Lsp1β and Pellino there is a snRNA:U1 gene, orthologous to snRNA:U1:95Ca of D. melanogaster. In D. pseudoobscura, the Lsp1β gene is included within contig 815_contig 5737, which has been putatively assigned to chromosome XR, homologous to D. melanogaster chromosomal arm 3L (Muller’s element D; Table 1). As in D. buzzatii, Lsp1β is flanked by CG12492 and Pellino but in D. pseudoobscura there are three snRNA:U1 genes between Lsp1β and Pellino (Figure 1b). In D. melanogaster, CG12492, snRNA:U1:95Ca, and Pellino are localized in chromosomal arm 3R (Figure 1c) while Lsp1β is located at 21D-E in chromosomal arm 2L (Figure 1d). The alignments of D. pseudoobscura and D. melanogaster genome sequences, available at http://pipeline.lbl.gov/pseudo/, show that Lsp1β and its flanking genes are in fact included in contig 7891_contig 7492 assigned to chromosome 2 (Muller’s element E; Table 1). We concluded that this must be the correct localization on the basis of the fact that CG12492 and Pellino are found on element E in both D. melanogaster and D. buzzatii.

In situ hybridization of the λjq7-4/27 phage, containing the entire Lsp1γ coding sequence, to the polytene chromosomes of D. buzzatii allowed us to map this gene to band C2g of chromosome 3 (see Figure S2 at http://www.genetics.org/supplemental/). This chromosome is homologous to chromosomal arm 2L of D. melanogaster (Muller’s element B; Table 1). In total, 10,817 nucleotides including the entire Lsp1γ gene and its flanking regions have been sequenced in D. buzzatii (accession no. AY561259). Upstream of Lsp1γ, 4.5 kb from the ATG codon, there is a Leucyl transfer RNA (Leu-tRNA) gene (Figure 2a).

In D. pseudoobscura, Lsp1γ is found within contig 1500_contig 3546, which belongs to chromosome 4, homologous to chromosomal arm 2L of D. melanogaster (Muller’s element B; Table 1). Analysis of the D. pseudoobscura genomic sequence revealed that 4.9 kb upstream from Lsp1γ there is an Aspartic acid transfer RNA (Asp-tRNA). Both Lsp1γ and Asp-tRNA are nested inside the first intron of the gene Sema-1a (Figure 2b). In both D. pseudoobscura and D. buzzatii, we detected eight short sequences, 28 to 59 nucleotides long, scattered along the 5-kb region upstream of the Lsp1γ coding sequence, that are highly conserved in D. melanogaster (86–97% nucleotide identity). Another four short sequences, 41–55 nucleotides long, also highly conserved between the three species (91–98% nucleotide identity), were found downstream of Lsp1γ in D. pseudoobscura and D. buzzatii (Figure 2). The conservation in number and relative position of these 12 highly conserved noncoding sequences flanking Lsp1γ in both D. buzzatii and D. pseudoobscura led us to conclude that the molecular organization of this gene region is the same in the two species; i.e., Lsp1γ is nested within an intron of Sema-1a in both cases. On the other hand, in D. melanogaster (Figure 2, c and d), the Sema-1a gene is located in band 29E1-3 on chromosomal arm 2L (Muller’s element B; Table 1) and inside its first intron there is also an Asp-tRNA gene (Figure 2c) but not the Lsp1γ gene, which is located on chromosomal arm 3L (Muller’s element D; Table 1).

Figure 2. Genome organization of the Lsp1γ region in D. buzzatii, D. pseudoobscura, and D. melanogaster (for symbols see legend for Figure 1). (a) Annotation and chromosomal localization of the 10,817 bp sequenced in D. buzzatii. Restriction map includes EcoRI (E), HindIII (H), SalI (S), and ClaI (C) sites. (b) Annotation of the homologous region in D. pseudoobscura. Note that Lsp1γ is nested within the Sema-1a gene. (c) Fragment of the AE003621 genomic clone of D. melanogaster containing the Sema-1a gene. Note that Lsp1γ is not present within the first intron. (d) Fragment of D. melanogaster AE003467 genomic clone that contains Lsp1γ.

Molecular structure of Lsp1β and Lsp1γ genes in D. buzzatii and D. pseudoobscura: In the two species, both Lsp1β and Lsp1γ are made up of two exons separated by a small intron (Table 3). The alignment of D. buzzatii, D. pseudoobscura, and D. melanogaster nucleotide sequences shows a 79.4% nucleotide identity for Lsp1β

and 75.7% nucleotide identity for Lsp1γ. For both genes, the intron is placed in the same precise site in the three species and has a similar length (63–71 nucleotides for Lsp1β and 57–65 nucleotides for Lsp1γ) but shows little nucleotide conservation other than the donor (G·T·A/G·A·G·T/C) and acceptor (C·A·G) splice sites (Delaney et al. 1986).

TABLE 3
Molecular structure of Lsp1 genes in D. buzzatii and D. pseudoobscura compared to those of D. melanogaster

                   D. buzzatii (a)     D. pseudoobscura (b)    D. melanogaster (c)
Gene region (bp)   Lsp1β    Lsp1γ      Lsp1β    Lsp1γ          Lsp1α    Lsp1β    Lsp1γ
5′-UTR             89       68         82       88             88       85       82
Exon 1             210      210        210      207            210      210      207
Intron 1           63       57         71       64             67       67       65
Exon 2             2157     2112       2154     2115           2241     2160     2112
3′-UTR             108      127        117      139            69       151      105

(a) Data from this work.
(b) Our annotation of the D. pseudoobscura genome (contig 7891_contig 7492 for Lsp1β and contig 1500_contig 3546 for Lsp1γ of the D. pseudoobscura database).
(c) Genomic clones AE003489 (Lsp1α), AE003588 (Lsp1β), and AE003467 (Lsp1γ) from the Berkeley Drosophila Genome Project database.

Both Lsp1β and Lsp1γ possess a TATA box, which is localized in D. buzzatii and D. pseudoobscura at the same nucleotide position as in D. melanogaster (−32 to −26). Sequence similarity extends for several nucleotides on either side of the TATA box (Figure 3). The 5′-UTR of D. buzzatii Lsp1β has 89 bp, the first 21 nucleotides being identical to those in D. melanogaster and D. pseudoobscura. The 5′-UTR of D. buzzatii Lsp1γ is 68 bp long and 17 of the first 21 nucleotides are identical to those in D. melanogaster and D. pseudoobscura. In both cases, another block with significant sequence similarity is found further downstream (Figure 3). A polyadenylation signal (AATAAA) is located ~100 bp downstream from the stop codon in Lsp1β and Lsp1γ. In Lsp1β, but not in Lsp1γ, the sequence around this signal is highly conserved in the three species (Table 4a). Apart from this, no other conserved sequences have been found in the 3′ region of Lsp1 genes.

We searched for putative regulatory sequences in the 5′ regions of Lsp1 genes in D. buzzatii and D. pseudoobscura following the criteria devised by Bergman and Kreitman (2001). Comparison of the 5′ ends of Lsp1β led to the identification of three conserved sequences starting at sites −189, −126, and −683 of D. buzzatii (Table 4, b, c, and d). Another three conserved sequences starting at sites −379, −181, and −71 of D. buzzatii were found in the 5′ region of Lsp1γ (Table 4, e, f, and g).

Molecular evolution of the Lsp1 genes: The coding sequences of Lsp1β and Lsp1γ from D. buzzatii, D. melanogaster, and D. pseudoobscura and that of Lsp1α from D. melanogaster were aligned using ClustalW. The numbers of synonymous and nonsynonymous substitutions per site for pairwise comparisons between the three species were estimated using maximum likelihood methods (Yang 1997). The dN/dS ratios for Lsp1β (0.0226–0.0449) are similar to those for Lsp1γ (0.0209–0.0512). Comparisons of Lsp1α of D. melanogaster with Lsp1β (0.0470–0.0545) and Lsp1γ (0.0367–0.0386) of the other two species also yielded similar results (Table 5). Overall, the dN/dS ratios were low, suggesting a relatively high degree of functional constraint on these genes in the three species analyzed.

Two different methods, neighbor joining and the unweighted pair group method using arithmetic averages (UPGMA), were used to construct phylogenetic trees using PHYLIP software (Felsenstein 1989). In addition to the seven above-mentioned sequences, we included in the trees those of arylphorin, an Lsp1-like gene of Calliphora vicina (Naumann and Scheller 1991), and Lsp2 of D. melanogaster (Adams et al. 2000). Alternative models for the evolution of the Lsp genes were then tested using maximum likelihood methods (Yang 1997; Bielawski and Yang 2003). First we tested for the constancy of evolutionary rates by comparing both trees: that produced with the UPGMA method, which assumes a molecular clock, and that built with the neighbor-joining method, assuming no clock. In both cases a single dN/dS ratio for all lineages was considered. The difference between the likelihoods of both trees was significant (2Δl = 32.64; 7 d.f.; P < 0.005), indicating that the model assuming no clock provides a significantly better fit to the data (Figure 4). We then tested for homogeneity in the dN/dS ratio between lineages by comparing the model assuming no clock and a single dN/dS ratio with a free-ratio model, which assumes an independent dN/dS ratio for each lineage. The result was significant (2Δl = 83.18; 14 d.f.; P < 0.0001), indicating that the dN/dS ratios are heterogeneous among lineages. Finally we compared a model assuming several dN/dS ratios (one background ratio and one ratio for each of the lineages leading to the Lsp1 genes in D. melanogaster) with the single-ratio model. Again the difference was statistically

Figure 3. ClustalW alignment of 5′-UTR and close upstream sequences of Lsp1α in D. melanogaster (Dm) and Lsp1β and Lsp1γ in D. melanogaster, D. pseudoobscura (Dp), and D. buzzatii (Db). Conserved regions are included in rectangles. Asterisks show nucleotides identical in the seven sequences and boldface type indicates the most common nucleotide in each position. TATA box is underlined; +1 shows the first nucleotide of the 5′-UTR. The first AUG codon is in boldface type and is indicated by dots below the sequence.

significant (2Δl = 21.86; 3 d.f.; P < 0.005), indicating a better fit to the data for the model assuming several dN/dS ratios.

DISCUSSION

Lsp1 gene number: In D. melanogaster the LSP-1 protein is made up of three subunits encoded by three Lsp1 genes: Lsp1α, Lsp1β and Lsp1γ (Roberts and Evans-Roberts 1979). However, in D. buzzatii, a species of the Drosophila subgenus, and in D. pseudoobscura, a species of the Sophophora subgenus, only two Lsp1 genes seem to be present: Lsp1β and Lsp1γ. Two different genomic libraries of D. buzzatii have been screened using as probes a fragment of D. melanogaster Lsp1α and a fragment of D. buzzatii Lsp1β or Lsp1γ. Overall, four different library screenings have been carried out and in every case all positive clones contained either Lsp1β or Lsp1γ. No Lsp1α gene was found. To corroborate this result, genomic DNA was digested with restriction enzymes and hybridized with the same three probes used to screen the libraries and again the results were in agreement with the existence of only two genes, Lsp1β and Lsp1γ, in the D. buzzatii genome. Also, the D. pseudoobscura genome has been recently sequenced to approximately sevenfold coverage and is available at http://www.hgsc.bcm.tmc.edu. Similarity searches against this database have been performed, allowing us to annotate Lsp1β and Lsp1γ but not Lsp1α. Thus, our results indicate that Lsp1α is not present in D. buzzatii or D. pseudoobscura. The most parsimonious explanation for this observation is that the duplicative transposition that gave rise to Lsp1α took place in the lineage leading to D. melanogaster after the divergence of the D. pseudoobscura lineage ~30 MYA (but see below). The fact that D. melanogaster Lsp1α is not dosage compensated although it is X linked is in agreement with a recent duplicative transposition onto this chromosome (Roberts and Evans-Roberts 1979).

Brock and Roberts (1983) reported that in eight species of the Drosophila and Sophophora subgenera, Lsp1α and Lsp1β map to the same polytene band in chromosomal element E except in D. pseudoobscura, where they observed an extra signal in element B (see Table 1). They concluded that there were at least two genes, one α-like and one β-like, given that Lsp1γ could not be localized by in situ hybridization. Contrasting results have been obtained in this work. We have unambiguously localized Lsp1β to chromosomal element E and the Lsp1γ gene to element B in both D. buzzatii and D. pseudoobscura. According to our results the extra signal in D. pseudoobscura element B likely corresponds to Lsp1γ. A plausible explanation for their results is that the Lsp1α and Lsp1β probes were cross-hybridizing to Lsp1β, as is suggested by the fact that the signals obtained with the Lsp1β probe were stronger than those obtained with the Lsp1α probes (Brock and Roberts 1983). Likewise, our previous in situ hybridization of D. melanogaster Lsp1α to D. buzzatii (and D. repleta) chromosomes (González et al. 2002; Ranz et al. 2003) must be reinterpreted as due to cross-hybridization with Lsp1β (see Figure S2 at http://www.genetics.org/supplemental/).

Larval serum proteins belong to the hemocyanin superfamily and are thought to act as storage proteins that provide amino acids and energy during nonfeeding periods of immature or adult development (Burmester et al. 1998). In D. melanogaster LSP-1 is a heterohexamer of randomly associated subunits encoded by the Lsp1α, Lsp1β, and Lsp1γ genes. The lack of one subunit in D. buzzatii and D. pseudoobscura does not imply that the LSP-1 protein will not be functional. In fact, it has been reported that an inbred stock of D. melanogaster lacking the γ-chain is viable under laboratory conditions, suggesting that a subunit specific function for the LSP-1

monomers does not exist (Brock and Roberts 1980). The polypeptides encoded by Lsp1β and Lsp1γ of D. melanogaster, D. pseudoobscura, and D. buzzatii and by Lsp1α of D. melanogaster were aligned with the ClustalW algorithm (see Figure S3 at http://www.genetics.org/supplemental/). These seven sequences have a similar length (772–789 aa), 49% of the amino acids are identical, and 33% of the amino acid substitutions are conservative or semiconservative. However, different subunits can accumulate different specific amino acids and this could allow the organism to modulate the availability of these amino acids in different developmental processes (Massey et al. 1997). The three subunits of the LSP-1 protein of D. melanogaster, as well as similar proteins in other Diptera, are enriched in aromatic amino acids (Burmester et al. 1998). Aromatic residues are thought to serve as precursors for quinones, which play a role in cuticle hardening during metamorphosis (Burmester et al. 1998). The polypeptides coded by Lsp1α and Lsp1β are also enriched in methionine but not those encoded by Lsp1γ. The same pattern is observed in D. pseudoobscura and D. buzzatii. The role of methionine in Drosophila development is not clear (Massey et al. 1997).

Gene duplication is considered a major force in gene family expansion and gene innovation. After duplication, one copy may become silenced (nonfunctionalization) or assume a novel function (neofunctionalization) or both copies may split the multiple functions of the ancestral gene (subfunctionalization; Lynch and Conery 2000). Gene amplification of highly expressed functions often leads to highly conserved paralogs in microbial genomes (Hooper and Berg 2003). The latter seems to be the case of the Lsp1 genes in Drosophila. The protein coded by these genes is accumulated to high levels by feeding larvae (Massey et al. 1997) and gene amplification has not resulted in different discernible functions.

TABLE 4

Conserved sequences between D. melanogaster, D. pseudoobscura, and D. buzzatii in the 3′-UTR (a) and 5′ region (b, c, and d) of Lsp1β and in the 5′ region (e, f, and g) of Lsp1γ

Lsp1β
a.  Dm  +2638  GCAAAAAGTCTAATAAACTTTCGAAAA  +2664
               ********** ****************
    Dp  +2618  GCAAAAAGTCAAATAAACTTTCGAAAA  +2644
               ********* *****************
    Db  +2611  GCAAAAAGTGAAATAAACTTTCGAAAA  +2637
b.  Dm   −216  AATAGAAGTCTGGCT--TTGATAAG  −194
               *  * **  **** *   *******
    Dp   −191  ACGACAAC-CTGGATGGCTGATAAG  −168
               * * ***  ****  *  *******
    Db   −189  ATGGCAA--CTGGGCGATTGATAAG  −167
c.  Dm   −130  AGCACCTGAGATACACCC  −113
               * *********** **
    Dp   −111  ACCACCTGAGATAGACTT  −94
               * ****** **** ***
    Db   −126  AGCACCTGCGATACACTC  −109
d.  Dm   −411  TATCTACATTTTTGAGGA  −394
               ** * ********  * *
    Dp   −512  TAGCAACATTTTTCTGCA  −495
               ******************
    Db   −683  TAGCAACATTTTTCTGCA  −666

Lsp1γ
e.  Dm   −380  AATTAAACCTGAAC--TGATATG  −360
               ** ***** *****  **
    Dp   −494  AAATAAACTTGAACTTTGCATAA  −472
               *  **** ****   **  **
    Db   −379  ATGTAAA-TTGACGATT--ATCG  −360
f.  Dm   −180  ACCACCTGAATTGAGGC  −168
                 *******  **
    Dp   −299  CGCACCTGACGTGCACT  −178
                 **    *  **  *
    Db   −181  GCCAGGGTAGTTGAGCC  −165
g.  Dm    −74  GTCTGCCGCTGATATGGTGCA  −54
               ******  *************
    Dp   −126  GTCTGCGACTGATATGGTGCA  −106
               *****   *************
    Db    −71  GTCTGATGCTGATATGGTGCA  −51

Blocks inside the sequences that fulfill the requirements of Bergman and Kreitman (2001) to identify noncoding conserved sequences are shown in boldface type. Dm, D. melanogaster; Dp, D. pseudoobscura; and Db, D. buzzatii. Asterisks denote identical nucleotides between two consecutive sequences. Polyadenylation signal is underlined.

Genome organization and number of transposition events: In both D. buzzatii and D. pseudoobscura, Lsp1β is flanked by CG12492 and Pellino genes on chromosomal element E (Figure 1, a and b). The homologous region in D. melanogaster, where CG12492 and Pellino are found, is also located on element E. This region shows no sequence similarity to any Lsp1 gene whatsoever (Figure 1c). In D. melanogaster, Lsp1β is located on element B (Figure 1d). In the three species, an snRNA:U1 is closely linked to the Lsp1β gene. Given that D. buzzatii and D. pseudoobscura, which belong to two different subgenera separated by 80–124 MYR (Beverley and Wilson 1984; Spicer 1988), show the same genomic organization for Lsp1β, this is likely to be the ancestral organization of the genus and therefore Lsp1β would have conservatively transposed from element E to element B in the lineage leading to D. melanogaster. The maximum size of the transposed region is 5.7 kb and the only protein-coding gene included is Lsp1β.

In D. buzzatii and D. pseudoobscura, Lsp1γ is flanked by a tRNA and both genes are located inside the first intron of Sema-1a on element B (Figure 2, a and b). In D. melanogaster, Sema-1a is located on element B while Lsp1γ is located on element D (Figure 2, c and d). Since the genomic organization of Lsp1γ is the same in D. buzzatii as in D. pseudoobscura, this is likely to be the
ancestral organization of the genus and thus the Lsp1γ gene would have conservatively transposed from element B to element D in the lineage leading to D. melanogaster. The maximum size of the transposed region is 5.9 kb and includes the putative regulatory regions located upstream of the gene. Another transposition event where the nested organization was the ancestral one had been previously reported in Drosophila (Neufeld et al. 1991).

Given that Lsp1β and Lsp1γ are present in the three species studied in different chromosomes, the duplicative transposition originating these two genes must have occurred before the divergence of these species (40–62 MYA). The chromosomal localization of the ancestral Lsp1 gene in the genus Drosophila is unknown. There is no reason to believe that this chromosome was element E as suggested by Brock and Roberts (1983) because in both D. buzzatii and D. pseudoobscura Lsp1β is located on element E and Lsp1γ is located on element B. Therefore, the ancestral Lsp1 gene could have been located on either of these two elements. In any case, to explain the current localization of these two genes in D. melanogaster at least two conservative transpositions are needed (see above). Another duplicative transposition gave rise to D. melanogaster Lsp1α. The absence of this gene in D. pseudoobscura indicates that this transposition likely occurred after the divergence between D. melanogaster and D. pseudoobscura. However, the neighbor-joining tree of Lsp1 sequences (Figure 4) suggests that this transposition occurred before the divergence between D. melanogaster and D. pseudoobscura. If this were the case, this gene would have been lost in the D. pseudoobscura lineage. Overall, at least two duplicative and two conservative transpositions are needed to explain the present localization of Lsp1 genes (Figure 5).

TABLE 5

Maximum likelihood estimates of the number of synonymous and nonsynonymous substitutions per site in the coding regions of Lsp1 genes for pairwise comparisons among D. melanogaster, D. pseudoobscura, and D. buzzatii

Comparison                                t       k      S       N       dS      dN   dN/dS
Lsp1β
  D. melanogaster-D. buzzatii        1.4840  1.9491  268.1  1708.9   3.1884  0.0720  0.0226
  D. melanogaster-D. pseudoobscura   0.4923  1.6360  179.9  1797.1   1.2450  0.0559  0.0449
  D. buzzatii-D. pseudoobscura       0.9964  1.8803  286.8  1690.2   1.9216  0.0624  0.0325
Lsp1γ
  D. melanogaster-D. buzzatii        1.5423  1.7122  336.8  1640.2   2.5579  0.0944  0.0369
  D. melanogaster-D. pseudoobscura   1.1835  1.4705  252.3  1724.7   2.7054  0.0565  0.0209
  D. buzzatii-D. pseudoobscura       0.8167  1.4804  261.9  1715.1   1.5395  0.0787  0.0512
Lsp1α-Lsp1β
  D. melanogaster-D. melanogaster    1.0643  1.5939  247.8  1729.2   2.2380  0.0849  0.0379
  D. melanogaster-D. buzzatii        1.2360  1.1993  309.2  1667.8   2.0358  0.1109  0.0545
  D. melanogaster-D. pseudoobscura   1.0806  1.6403  269.3  1707.7   2.0375  0.0957  0.0470
Lsp1α-Lsp1γ
  D. melanogaster-D. melanogaster    5.5285  1.4217  316.1  1660.9  10.1718  0.2578  0.0253
  D. melanogaster-D. buzzatii        4.4263  1.3910  321.9  1655.1   7.6226  0.2800  0.0367
  D. melanogaster-D. pseudoobscura   3.0028  1.2382  247.5  1729.5   6.2989  0.2429  0.0386

t, branch length; k, transition/transversion rate ratio; S, synonymous positions; N, nonsynonymous positions; dN/dS, ratio of nonsynonymous/synonymous substitution rate.

Figure 4. Neighbor-joining tree of Lsp1 genes from D. melanogaster, D. pseudoobscura, and D. buzzatii species. Arylphorin from C. vicina and Lsp2 from D. melanogaster are also included. The tree was constructed with an alignment of the coding sequence of the genes.

Mechanism of transposition: As stated before, the position of the unique intron of Lsp1β and Lsp1γ and the 5′ putative regulatory sequences are conserved in the three species analyzed. This allows us to rule out retroposition (Betrán et al. 2002) as the mechanism of
transposition. The analysis of the flanking regions of both genes provides no evidence that transposition had been mediated by transposable elements (TEs). A single TE copy was found in the original position of Lsp1β in D. melanogaster (Figure 1c) but there are no indications of its involvement in the transposition. However, taking into account the divergence time between the species analyzed, mobile elements could have played a role in the origin of the transposition and then be lost by deletion or excision. The most striking feature of Lsp1 genes in the three species analyzed is that they are very close to snRNA or tRNA genes. Both are repetitive genes that are scattered in the genome. However, the probability that Lsp1 genes are close to snRNA or tRNA genes just by chance seems very low. A recent review of D. melanogaster genome sequence-Release 3 has found 290 tRNA genes and only 28 snRNAs in the euchromatin (Misra et al. 2002), i.e., ~1 tRNA gene/0.4 Mb and 1 snRNA gene/4.3 Mb of euchromatic DNA. Repeated genes, e.g., tRNA and ribosomal protein genes, have been previously implicated in the origin of chromosomal rearrangements in yeasts (Szankasi et al. 1986; Kellis et al. 2003).

Figure 5. Phylogenetic relationships and karyotypic organization of D. melanogaster and D. buzzatii. The present chromosomal localization of Lsp1 genes in D. melanogaster and D. buzzatii is shown. At least two duplicative and two conservative transpositions are required to explain the evolution of these genes from an ancestral Lsp1 gene.

We propose that an ectopic exchange between snRNA:U1 sequences mediated the Lsp1β transposition in Drosophila. Three observations indicate that snRNA:U1 genes are implicated in this transposition. First, they are present in the original location of chromosomal element E as well as in the destination site of element B. Second, snRNA:U1 genes present a high level of nucleotide identity among the three species (98%) suggesting that they might act as substrates for ectopic exchanges. Finally, the snRNA:U1 gene represents the downstream boundary of the transposed chromosomal segment (Figure 1), which would be expected if ectopic recombination actually took place. To our knowledge, snRNA genes have not been previously implicated in the generation of rearrangements. Nevertheless, there is no reason why they could not act as substrates for ectopic recombination in a manner similar to that of tRNAs in yeasts or Alu sequences in humans. As a matter of fact, the mechanism for gene transposition in Drosophila could be quite similar to that originating segmental duplications in humans (see Figure 6 in Bailey et al. 2003).

The evidence for the implication of tRNA genes in the transposition of Lsp1γ is weak. The tRNA genes found near Lsp1γ in D. buzzatii and D. pseudoobscura belong to a different isoacceptor type and no tRNA gene was found near Lsp1γ in D. melanogaster. In addition these tRNAs lie outside the transposed chromosomal segment. Therefore, an involvement of tRNA genes in the transposition of Lsp1γ is unlikely and in this case the mechanism remains uncertain.

5′ noncoding conserved sequences: Transposition implies the localization of the transposed gene in a new genomic environment. The probability of success of a transposition should be higher when it includes the regulatory regions than when it does not. To be functional, in the latter case the transposed gene would need to recruit new regulatory regions (Betrán et al. 2002). Comparative sequence analysis allows the identification of conserved DNA sequences in noncoding regions that are considered putative cis-acting regulatory elements (Bergman and Kreitman 2001). Comparison of the 5′ ends of Lsp1α, Lsp1β and Lsp1γ genes of D. melanogaster led to the identification of two such conserved sequences (Delaney et al. 1986). These sequences are also conserved in Lsp1β (Table 4, b and c) and Lsp1γ (Table 4, e and f) in the three species analyzed. These two regions are more conserved between Lsp1α and Lsp1β than between Lsp1β and Lsp1γ, which is in agreement with the origin of Lsp1α from a duplication of Lsp1β (Smith et al. 1981). Two conserved sequences not previously described in D. melanogaster have also been identified. One is exclusive to Lsp1β, the other is exclusive to Lsp1γ (Table 4, d and g), and both are highly conserved in the three species analyzed. All conserved sequences except that located at −181 of D. buzzatii Lsp1γ fulfill the requirements used by Bergman and Kreitman (2001) to identify noncoding conserved blocks. The mean size of these five blocks is 13 bp, similar to the modal size of 11 bp reported by these authors, and three of them are included in the conserved blocks that they have described. They follow the pattern described for cis-regulatory elements in Drosophila, i.e., highly conserved sequences separated by unalignable gaps (Bergman and Kreitman 2001).

The 5′-untranslated regions of the RNA show two highly conserved sequences common to the three Lsp1 genes in the three species analyzed (Figure 3). Although the significance of these homologies in the 5′-UTR is not clear, it has been suggested that the synthesis of LSP-1 protein may be under translational control (Powell et al. 1984). All the signals necessary for the correct tissue- and temporal-specific expression of Lsp1α and Lsp1β in D. melanogaster lie 1650 and 2250 bp upstream from these two genes, respectively (Jowett 1985; Davies et al. 1986). In this work we have analyzed more than 2 kb upstream from Lsp1β and Lsp1γ genes and we are quite confident that most, if not all, of the cis-acting regulatory sequences have been identified. The fact that the transposition of Lsp1 genes included not only the genes but also the regulatory regions has probably played a role in the success of these transpositions.

Molecular evolution: Maximum likelihood methods of phylogenetic inference were used to test alternative models for the evolution of Lsp1 genes (Yang 1997). Figure 4 shows the phylogenetic tree that provides a significantly better fit to the data. The tree is in agreement with the appearance of Lsp1α from a duplication of Lsp1β in the Sophophora subgenus.

The ratio of nonsynonymous (amino acid changing, dN) to synonymous (silent, dS) substitution rates is a measure of selective pressure on a protein: if amino acid changes are neutral dN/dS = 1, if they are mostly deleterious dN/dS < 1, and if they offer a selective advantage dN/dS > 1 (Yang and Bielawski 2000; Bielawski and Yang 2003). A model assuming free dN/dS ratios for different lineages provided a significantly better fit to the data than a model with a single ratio for all lineages. This led us to test if a model assuming four different dN/dS ratios, one background ratio and one ratio for each of the three lineages leading to D. melanogaster Lsp1 genes, was significantly different from the single-ratio model. According to our observations, the Lsp1 genes have transposed in these three lineages, and we wanted to test if the change of chromosomal location has affected the molecular evolution of these genes. Again the results were significant, indicating that the model assuming four different dN/dS ratios fits the data better. The estimates obtained for the dN/dS ratios in this model were 0.0518 for the background ratio and 0.0769, 0.0598, and 0.0231 for the lineages leading to Lsp1α, Lsp1β, and Lsp1γ of D. melanogaster, respectively. These results suggest that changes in the selection regime (degree of functional constraint) are associated with Lsp1 transpositions. However, dN/dS ratios do not show a consistent pattern, being higher than the background ratio for Lsp1α and lower than the background ratio for Lsp1γ. In addition, the background ratio is itself heterogeneous, indicating that other factors besides transposition have an effect on the molecular evolution of Lsp1 genes.

We thank L. Sánchez (Centro de Investigaciones Biológicas, Consejo Superior de Investigaciones Científicas, Madrid) for providing the D. melanogaster Lsp1α clone and E. Hasson (Universidad de Buenos Aires) for D. buzzatii stocks. We also thank B. Negre for technical advice and helpful discussion. This work was supported by grant BMC2002-01708 from the Dirección General de Investigación (Ministerio de Ciencia y Tecnología, Spain) awarded to A.R.

LITERATURE CITED

Adams, M. D., S. E. Celniker, R. A. Holt, C. A. Evans, J. D. Gocayne et al., 2000 The genome sequence of Drosophila melanogaster. Science 287: 2185–2195.
Armengol, L., M. A. Pujana, J. Cheung, S. W. Scherer and X. Estivill, 2003 Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggests their involvement in evolutionary rearrangements. Hum. Mol. Genet. 12: 2201–2208.
Bachtrog, D., and B. Charlesworth, 2003 On the genomic location of the exuperantia1 gene in Drosophila miranda: the limits of in situ hybridization experiments. Genetics 164: 1237–1240.
Bailey, J. A., Z. Gu, R. A. Clark, K. Reinert, R. V. Samonte et al., 2002 Recent segmental duplications in the human genome. Science 297: 1003–1007.
Bailey, J. A., G. Liu and E. E. Eichler, 2003 An Alu transposition model for the origin and expansion of human segmental duplications. Am. J. Hum. Genet. 73: 823–834.
Bennetzen, J. L., and J. Ma, 2003 The genetic colinearity of rice and other cereals on the basis of genomic sequence analysis. Curr. Opin. Plant Biol. 6: 128–133.
Bergman, C. M., and M. Kreitman, 2001 Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11: 1335–1345.
Bergman, C. M., B. D. Pfeiffer, D. E. Rincon-Limas, R. A. Hoskins, A. Gnirke et al., 2002 Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome. Genome Biol. 3: research0086.1–0086.20.
Betrán, E., K. Thornton and M. Long, 2002 Retroposed new genes out of the X in Drosophila. Genome Res. 12: 1854–1859.
Beverley, S. M., and A. C. Wilson, 1984 Molecular evolution in Drosophila and higher Dipterans. II. A time scale for fly evolution. J. Mol. Evol. 21: 1–12.
Bielawski, J. P., and Z. Yang, 2003 Maximum likelihood methods for detecting adaptive evolution after gene duplication. J. Struct. Funct. Genomics 3: 201–212.
Brock, H. W., and D. B. Roberts, 1980 Comparison of the Larval serum proteins of Drosophila melanogaster using one and two-dimensional peptide mapping. Eur. J. Biochem. 106: 129–135.
Brock, H. W., and D. B. Roberts, 1983 Location of the LSP-1 genes in Drosophila species by in situ hybridization. Genetics 103: 75–92.
Burmester, T., H. C. Massey, Jr., S. O. Zakharkin and H. Benes, 1998 The evolution of hexamerins and the phylogeny of insects. J. Mol. Evol. 47: 93–108.
Cáceres, M., M. Puig and A. Ruiz, 2001 Molecular characterization of two natural hotspots in the Drosophila buzzatii genome induced by transposon insertions. Genome Res. 11: 1353–1364.
Casals, F., M. Cáceres and A. Ruiz, 2003 The Foldback-like transposon Galileo is involved in the generation of two different natural chromosomal inversions of Drosophila buzzatii. Mol. Biol. Evol. 20: 674–685.
Celniker, S. E., D. A. Wheeler, B. Kronmiller, J. W. Carlson, A. Halpern et al., 2002 Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatin genome sequence. Genome Biol. 3: research0079.1–0079.14.
Cheung, J., M. D. Wilson, J. Zhang, R. Khaja, J. R. Macdonald et al., 2003 Recent segmental duplications in the mouse genome. Genome Biol. 4: R47.
Chia, W., S. McGill, R. Karp, D. Gubb and M. Ashburner, 1985 Spontaneous excision of a large composite transposable element of Drosophila melanogaster. Nature 316: 81–83.
Coghlan, A., and K. H. Wolfe, 2002 Fourfold faster rate of genome rearrangement in nematodes than in Drosophila. Genome Res. 12: 857–867.
Davies, J. A., C. F. Addison, S. J. Delaney, C. Sunkel and D. M. Glover, 1986 Expression of the prokaryotic gene for chloramphenicol acetyl transferase in Drosophila under the control of Larval serum protein 1 gene promoters. J. Mol. Biol. 189: 13–24.
Delaney, S. J., D. F. Smith, A. McClelland, C. Sunkel and D. M. Glover, 1986 Sequence conservation around the 5′ ends of the Larval serum protein 1 genes of Drosophila melanogaster. J. Mol. Biol. 189: 1–11.
Eichler, E. E., 2001 Recent duplication, domain accretion and the dynamic mutation of the human genome. Trends Genet. 17: 661–669.
Ejima, Y., and L. Yang, 2003 Trans mobilization of genomic DNA as a mechanism for retrotransposon-mediated exon shuffling. Hum. Mol. Genet. 12: 1321–1328.
Esnault, C., J. Maestre and T. Heidmann, 2000 Human LINE retrotransposons generate processed pseudogenes. Nat. Genet. 24: 363–367.
Felsenstein, J., 1989 PHYLIP: phylogeny inference package (version 3.2). Cladistics, 164–166.
Fischer, G., C. Neuvéglise, P. Durrens, C. Gaillardin and B. Dujon, 2001 Evolution of gene order in the genomes of two related yeast species. Genome Res. 11: 2009–2019.
González, J., J. M. Ranz and A. Ruiz, 2002 Chromosomal elements evolve at different rates in the Drosophila genome. Genetics 161: 1137–1154.
Hooper, S. D., and O. G. Berg, 2003 On the nature of gene innovation: duplication patterns in microbial genomes. Mol. Biol. Evol. 20: 945–954.
Jowett, T., 1985 The regulatory domain of a larval serum protein gene in Drosophila melanogaster. EMBO J. 4: 3789–3795.
Kellis, M., N. Patterson, M. Endrizzi, B. Birren and E. S. Lander, 2003 Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254.
Krimbas, C. B., and J. R. Powell, 1992 Drosophila Inversion Polymorphism. CRC Press, Boca Raton, FL.
Ku, H., T. Vision, J. Liu and S. D. Tanksley, 2000 Comparing sequenced segments of the tomato and Arabidopsis genomes: large-scale duplication followed by selective gene loss creates a network of synteny. Proc. Natl. Acad. Sci. USA 97: 9121–9126.
Lander, E. S., L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody et al., 2001 Initial sequencing and analysis of the human genome. Nature 409: 860–921.
Llorente, B., A. Malpertuy, C. Neuvéglise, J. De Montigny, M. Aigle et al., 2000 Genomic exploration of the Hemiascomycetous yeasts: 18. Comparative analysis of chromosome maps and synteny with Saccharomyces cerevisiae. FEBS Lett. 47: 101–112.
Locke, D. P., N. Archidiacono, D. Misceo, M. F. Cardone, S. Deschamps et al., 2003 Refinement of a chimpanzee pericentric inversion breakpoint to a segmental duplication cluster. Genome Biol. 4: R50.
Long, M., E. Betrán, K. Thornton and W. Wang, 2003 The origin of new genes: glimpses from the young and old. Nat. Rev. Genet. 4: 865–875.
Lovering, R., N. Harden and M. Ashburner, 1991 The molecular structure of TE146 and its derivatives in Drosophila melanogaster. Genetics 128: 357–372.
Lynch, M., and J. S. Conery, 2000 The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155.
Massey, H. C., Jr., J. Kejzlarova-Lepesant, R. L. Willis, A. B. Castleberry and H. Benes, 1997 The Drosophila Lsp-1 beta gene. A structural and phylogenetic analysis. Eur. J. Biochem. 245: 199–207.
Misra, S., M. A. Crosby, C. J. Mungall, B. B. Matthews, K. S. Campbell et al., 2002 Annotation of the Drosophila melanogaster euchromatic genome: a systematic review. Genome Biol. 3: research0083.1–0083.22.
Montgomery, E., B. Charlesworth and C. H. Langley, 1987 A test for the role of natural selection in the stabilization of transposable element copy number in a population of Drosophila melanogaster. Genet. Res. 49: 31–41.
Moran, J. V., R. J. DeBerardinis and H. H. Kazazian, Jr., 1999 Exon shuffling by L1 retrotransposition. Science 283: 1465–1467.
Morgenstern, B., 1999 DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. Bioinformatics 15: 211–218.
Naumann, U., and K. Scheller, 1991 Complete cDNA and gene sequence of the developmentally regulated arylphorin of Calliphora vicina and its homology to insect hemolymph proteins and arthropod hemocyanins. Biochem. Biophys. Res. Commun. 177: 963–972.
Neufeld, T. P., R. W. Carthew and G. M. Rubin, 1991 Evolution of gene position: chromosomal arrangement and sequence comparison of the Drosophila melanogaster and Drosophila virilis sina and Rh4 genes. Proc. Natl. Acad. Sci. USA 88: 10203–10207.
Powell, D., J. D. Sato, H. W. Brock and D. B. Roberts, 1984 Regulation and synthesis of the Larval serum proteins of Drosophila melanogaster. Dev. Biol. 102: 206–215.
Powell, J. R., 1997 Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press, New York.
Ranz, J. M., J. González, F. Casals and A. Ruiz, 2003 Low occurrence of gene transposition events during the evolution of the genus Drosophila. Evolution 57: 1325–1335.
Roberts, D. B., and S. Evans-Roberts, 1979 The X-linked alpha-chain gene of Drosophila LSP-1 does not show dosage compensation. Nature 280: 691–692.
Ruiz, A., and M. Wasserman, 1993 Evolutionary cytogenetics of the Drosophila buzzatii species complex. Heredity 70: 582–596.
Russo, C. A. M., N. Takezaki and M. Nei, 1995 Molecular phylogeny and divergence times of Drosophilid species. Mol. Biol. Evol. 12: 391–404.
Sambrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Samonte, R. V., and E. E. Eichler, 2001 Segmental duplications and the evolution of the primate genome. Nat. Rev. Genet. 3: 65–72.
Smith, D. F., A. McClelland, B. N. White, C. F. Addison and D. M. Glover, 1981 The molecular cloning of a dispersed set of developmentally regulated genes which encode the major larval serum protein of D. melanogaster. Cell 23: 441–449.
Spicer, G. S., 1988 Molecular evolution among some Drosophila species groups as indicated by two-dimensional electrophoresis. J. Mol. Evol. 27: 250–260.
Szankasi, P., C. Gysler, U. Zehntner, U. Leupold, J. Kohli et al., 1986 Mitotic recombination between dispersed but related tRNA genes of Schizosaccharomyces pombe generates a reciprocal translocation. Mol. Gen. Genet. 202: 394–402.
Thompson, J. D., D. G. Higgins and T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Throckmorton, L. H., 1975 The phylogeny, ecology and geography of Drosophila, pp. 421–469 in Handbook of Genetics, edited by R. C. King. Plenum Press, New York.
Yang, Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13: 555–556.
Yang, Z., and J. P. Bielawski, 2000 Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 15: 496–503.

Communicating editor: M. Veuille
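
The dN/dS values in Table 5 and the likelihood-ratio statistic quoted with the Molecular evolution results (2Δl = 21.86; 3 d.f.) can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the original analysis, which used PAML. It recomputes dN/dS from the dS and dN columns of Table 5, applies the interpretation rule cited from Yang and Bielawski (2000), and evaluates 2Δl against a chi-square distribution with 3 d.f. The function names, the ASCII row labels, the rounding tolerance, and the width of the "neutral" band in `classify_omega` are assumptions introduced for this example.

```python
import math

# (comparison, dS, dN, dN/dS) copied from Table 5; labels are shortened here
# (a = alpha, b = beta, g = gamma; Dm, Dp, Db as in Table 4).
TABLE5 = [
    ("Lsp1b Dm-Db", 3.1884, 0.0720, 0.0226),
    ("Lsp1b Dm-Dp", 1.2450, 0.0559, 0.0449),
    ("Lsp1b Db-Dp", 1.9216, 0.0624, 0.0325),
    ("Lsp1g Dm-Db", 2.5579, 0.0944, 0.0369),
    ("Lsp1g Dm-Dp", 2.7054, 0.0565, 0.0209),
    ("Lsp1g Db-Dp", 1.5395, 0.0787, 0.0512),
    ("Lsp1a-Lsp1b Dm-Dm", 2.2380, 0.0849, 0.0379),
    ("Lsp1a-Lsp1b Dm-Db", 2.0358, 0.1109, 0.0545),
    ("Lsp1a-Lsp1b Dm-Dp", 2.0375, 0.0957, 0.0470),
    ("Lsp1a-Lsp1g Dm-Dm", 10.1718, 0.2578, 0.0253),
    ("Lsp1a-Lsp1g Dm-Db", 7.6226, 0.2800, 0.0367),
    ("Lsp1a-Lsp1g Dm-Dp", 6.2989, 0.2429, 0.0386),
]


def omega(d_n, d_s):
    """dN/dS: nonsynonymous rate divided by synonymous rate."""
    return d_n / d_s


def classify_omega(w, tol=0.05):
    """Hypothetical helper: label a dN/dS value.

    tol is an arbitrary band around 1 chosen for illustration; it is not a
    statistical test of neutrality.
    """
    if w < 1.0 - tol:
        return "purifying"   # amino acid changes mostly deleterious
    if w > 1.0 + tol:
        return "positive"    # amino acid changes selectively advantageous
    return "neutral"


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f.

    Closed form for 3 d.f.: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Largest gap between the recomputed ratio and the tabulated dN/dS column.
max_dev = max(abs(omega(dn, ds) - w) for _, ds, dn, w in TABLE5)

# P value for the likelihood-ratio statistic 2*delta-l = 21.86 with 3 d.f.
p_value = chi2_sf_3df(21.86)

if __name__ == "__main__":
    for label, ds, dn, w in TABLE5:
        print(f"{label:<20} dN/dS = {omega(dn, ds):.4f} (table: {w:.4f}) {classify_omega(w)}")
    print(f"max deviation from table: {max_dev:.1e}")
    print(f"P(chi2_3 >= 21.86) = {p_value:.2e}")
```

Under these assumptions the recomputed ratios agree with the tabulated column to within rounding of the four-decimal entries, and the P value falls below the 0.005 threshold reported in the text. Every ratio in Table 5, like the lineage estimates given in the text (0.0518, 0.0769, 0.0598, and 0.0231), is far below 1, which the rule above reads as amino acid changes being mostly deleterious.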