MOLECULAR AND CELLULAR BIOLOGY, Nov. 1992, p. 5102-5110, Vol. 12, No. 11
0270-7306/92/115102-09$02.00/0
Copyright © 1992, American Society for Microbiology

Distinct Families of Site-Specific Retrotransposons Occupy Identical Positions in the rRNA Genes of Anopheles gambiae

NORA J. BESANSKY,(1,2)* SUSAN M. PASKEWITZ,(1)† DIANE MILLS HAMM,(1) AND FRANK H. COLLINS(1,2)

Malaria Branch, Division of Parasitic Diseases, National Center for Infectious Diseases, Centers for Disease Control, Atlanta, Georgia 30333,(1) and Department of Biology, Emory University, Atlanta, Georgia 30322(2)

Received 15 May 1992/Returned for modification 1 July 1992/Accepted 27 August 1992

Two distinct site-specific retrotransposon families, named RT1 and RT2, from the sibling mosquito species Anopheles gambiae and A. arabiensis, respectively, were previously identified. Both were shown to occupy identical nucleotide positions in the 28S rRNA gene and to be flanked by identical 17-bp target site duplications. Full-length representatives of each have been isolated from a single species, A. gambiae, and the nucleotide sequences have been analyzed. Beyond insertion specificity, RT1 and RT2 share several structural and sequence features which show them to be members of the LINE-like, or non-long-terminal-repeat retrotransposon, class of reverse transcriptase-encoding mobile elements. These features include two long overlapping open reading frames (ORFs), poly(A) tails, the absence of long terminal repeats, and heterogeneous 5' truncation of most copies. The first ORF of both elements, particularly ORF1 of RT1, is glutamine rich and contains long tracts of polyglutamine reminiscent of the opa repeat. Near the carboxy ends, three cysteine-histidine motifs occur in ORF1 and one occurs in ORF2. In addition, each ORF2 contains a region of sequence similarity to reverse transcriptases and integrases. Alignments of the protein sequences from RT1 and RT2 reveal 36% identity over the length of ORF1 and 60% identity over the length of ORF2, but the elements cannot be aligned in the 5' and 3' noncoding regions. Unlike that of RT2, the 5' noncoding region of RT1 contains 3.5 copies of a 500-bp subrepeat, followed by a poly(T) tract and two imperfect 55-bp subrepeats, the second spanning the beginning of ORF1. The pattern of distribution of these elements among five sibling species in the A. gambiae complex is nonuniform. RT1 is present in laboratory and wild A. gambiae, A. arabiensis, and A. melas but has not been detected in A. quadriannulatus or A. merus. RT2 has been detected in all available members of the A. gambiae complex except A. merus.
Copy number fluctuates, even among the offspring of individual wild female A. gambiae mosquitoes. These findings reflect a complex evolutionary history balancing gain and loss of copies against the coexistence of two elements competing for a conserved target site in the same species for perhaps millions of years.

* Corresponding author.
† Present address: Department of Entomology, University of Wisconsin, Madison, WI 53706.

Among the transposable elements that encode reverse transcriptase (RT) are those without long terminal repeats (LTRs). Examples of this type, sometimes referred to as non-LTR retrotransposons, are the mammalian LINE-1 (L1) elements and the Drosophila melanogaster I factors, F elements, and jockey (6). Comparisons among the many elements characterized from a broad phylogenetic spectrum of organisms have revealed several common structural and sequence features. Full-length elements, typically about 6 kb long, include one or two long open reading frames (ORFs) and poly(A)- or (A)-rich terminal tracts. Most copies, thousands in mammals and tens in other organisms, are heterogeneously truncated at the 5' end. The coding region includes domains with RT homology and may also contain one or more cysteine-histidine (Cys) motifs reminiscent of the nucleic acid-binding domains of retroviral nucleocapsid proteins. Recent reports have demonstrated transposition through an RNA intermediate for the I factor (37, 56) and the mouse L1 element (25). Indeed, the RT enzymes encoded by CRE1, jockey, and a human L1 element have been shown to be functional (30, 33, 50), and the RT activity is associated with virus-like particles of L1 RNA and protein in the microsomal fraction of human and mouse embryonal carcinoma cells (20, 49). However, unlike retroviruses, L1-like elements are not infectious (43), and the absence of LTRs indicates a very different replication strategy that is not well understood. Peptide alignments show that the RT and capsid-like domains of non-LTR retrotransposons are more closely related to one another than to those of other RT-encoding mobile elements (23, 51, 66, 68). However, although the coding region may occupy over 80% of the length of a given element, the sequence diversity among characterized non-LTR retrotransposons is high enough to preclude amino acid alignment of any but very short segments.

Many non-LTR retrotransposons, such as mammalian L1 elements (32) and T1 elements from mosquitoes in the Anopheles gambiae complex (8, 9), are widely dispersed in the host genomic DNA and have no apparent insertion site specificity. However, some elements of this same class exhibit a preference for a particular target site. In three different trypanosomatid protozoa, three distinct elements (CRE1, SLACS, and CZAR) interrupt some portion of the array of spliced leader RNA genes at the same conserved sequence (2). The Drosophila G element occupies a precise site in the intergenic spacers of the rRNA genes (22). Similarly, some 28S rRNA genes are interrupted by non-LTR retrotransposons in most insects (34). Most of the rDNA insertions studied belong to the R1 and R2 families of elements that are found at highly conserved sites located 74 bp apart and termed R1 and R2 sites, respectively. Exceptions from the Japanese beetle (34) and fungus gnat (41) insert about 30 bp upstream of the R2 site, one from a nematode (3) inserts between the two sites, and the mosquito elements reported here, RT1 and RT2, insert about 630 bp downstream (55).

The A. gambiae complex, named for the member species that is the most important vector of malaria in the world, is a group of at least six tropical African mosquito species that are morphologically indistinguishable as adults. They are so closely related that the divergence from a common ancestor of the most anthropophilic members may have accompanied the shift in settlement patterns and land use associated with the origins of agriculture in Africa in the last several thousand years (17). Nucleotide sequence comparisons using regions of the ribosomal and mitochondrial DNA are consistent with this view (11). Previously described non-LTR retrotransposons from diverse species have been too diverged to align, except for a few relatively short conserved amino acid sequence motifs typical of RT and putative nucleic acid-binding proteins. Sequence comparisons among distinct but less diverged element families from sibling species may help define the importance of less understood regions of the ORFs, particularly the more rapidly evolving gag-like ORF1. Analysis of ORF2 structural domains from distinct site-specific elements may elucidate those residues which dictate target site preference and recognition. Comparison between two elements competing for the same target site in A. gambiae and other species in the complex may also provide an evolutionary picture of their transmission, propagation, and loss. With these goals in mind, we isolated from A. gambiae full-length representatives of two non-LTR retrotransposon families with identical target site specificities and determined and analyzed their complete nucleotide sequences.

FIG. 1. Structural organization of RT1 and RT2 and their rDNA insertion site relative to those of R1 and R2. Cross-hatched boxes, coding regions; stippled boxes, subrepeat regions; black and open boxes, coding and transcribed spacer sequences, respectively; S, SalI sites used for subcloning. Not drawn to scale.

MATERIALS AND METHODS

Isolation and sequencing of clones. Genomic clones from EMBL3 phage libraries containing RT1 and RT2 insertions in the 3' region of the 28S rRNA gene of A. gambiae and A. arabiensis were isolated as described previously (55). Full-length RT1 clone Agr23 from A. gambiae was obtained from the initial screening. A full-length representative of RT2 from A. gambiae, Ag106, was selected from the EMBL3 library by using the KpnI-PstI fragment of Aar23, a truncated RT2 element from A. arabiensis, as a probe (55). Overlapping restriction fragments from each bacteriophage clone were subcloned into plasmid Bluescript SKII+ (Stratagene Cloning Systems, La Jolla, Calif.). Each subclone was sequenced from both strands by the dideoxy-chain termination method (58) as modified for double-stranded sequencing (62), by using a combination of commercially available and synthetic oligonucleotide primers. The structural organization of both elements and their insertion site in the rDNA are shown in Fig. 1. Sequences were analyzed by using the Genetics Computer Group Sequence Analysis Package (21) and a program kindly provided by W.-H. Li for estimating the rate of synonymous and nonsynonymous nucleotide substitutions between sequences (45), run on the VAX network of the Centers for Disease Control.

Preparation of probes. RT1 and RT2 probes were derived from subclones pAgr23H4S6 (~1.9-kb insert) and pAgr106S3 (~2.8-kb insert), respectively (Fig. 1). Probes were prepared following digestion of subclone DNA to release the insert and electrophoresis through a 1.2% low-melting-temperature agarose gel. The insert was excised from the gel and labeled with [32P]dCTP to 5 × 10^8 cpm/µg by random primer extension in the gel overnight (27).

Southern analysis. DNA was prepared from individual adult mosquitoes as described previously (15), digested with 15 U of SalI (Boehringer Mannheim Biochemicals, Indianapolis, Ind.) for 90 min, electrophoresed on a 0.8% agarose gel, and transferred to GeneScreen Plus membrane (Du Pont, NEN Research Products, Boston, Mass.). Hybridization was performed overnight in 0.25% nonfat dry milk (Carnation)-6× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate) at 65°C. The blot was washed three times for 15 min each time in prewarmed 0.1× SSC-0.1% sodium dodecyl sulfate at 65°C. Activity was recorded both by autoradiography and by a Betascope 603 Blot Analyzer (Betagen Corp., Waltham, Mass.) made available by the Centers for Disease Control Biotechnology Core Facility. The latter was used in accordance with the manufacturer's specifications to record beta emissions directly from the hybridized membranes.

Northern (RNA) analysis. Total RNA and polyadenylated mRNA were isolated from embryos, larvae, pupae, and adults of an A. gambiae G3 colony by using the guanidinium-cesium chloride method (47) and the FastTrack mRNA isolation kit (Invitrogen), respectively. RNA (10 µg of total RNA, 1 µg of mRNA) was separated by electrophoresis on a 0.8% agarose gel under denaturing conditions by using the formaldehyde method (48), transferred to GeneScreen Plus membrane, and hybridized in accordance with the manufacturer's recommendations.

Nucleotide sequence accession numbers. The sequences reported here have been given GenBank accession numbers M93690 (RT1) and M93691 (RT2).

RESULTS

Nucleotide sequence of an RT1 element. Clone Agr23 from A. gambiae was previously determined to contain a full-length RT1 element on the basis of its hybridization pattern to A. gambiae genomic DNA (55) and sequence comparisons at the 5' ends of other elements of the same length (data not shown). The RT1 element in Agr23 is 8,036 nucleotides long

RT1   1  QLQQQQQQRQPQRYVVAGSSQQQQQ..QHQQQQQKRKRPKPELIEISPGQNETI!FESVSLKIRKAVDDNGTHKELKDFIIMGRRTDKALLRLTLARSANA
RT2      QH'QQQQ'Q'QORQ'P'QRQAV'AG'SQ'Q'QQQERMQQQQQLQRKRKPRPDIIEVSPSEGETWDGIYDKVRKAIRLDAAHSENKGHIKQGRRTHARLLRKLSKTANAI

RT1 101  LILQQIRTIIGEAGTCRHVTEKAALVVNDIDPLAKEEELTALLENKIEGGAGIVSTSIRTMPDGTQRARVRLPAKAAKALDGTKLRLGFCISRVKMAPP
RT2      LMLEGVRKIIGDAGVSRLVTENGELLVVDIDPLATEEDIIAALDAKIGASAGVVSASIWELPDGSKRARIRLPVKSARQLEGLKLFLCDCVSKVRAAPP

RT1 201  PKEHLRCYRCLEHGHNARDCRSPVDRQNVCIRCGQEGHKAGTCMEEIRCGKCDGPHVIGDRTCDRSATQ
RT2      PPERQRCERCL2MHASNRSTADRQNLCIRCGLTGHKARSCQNEIGHSECARSAQR

FIG. 2. Alignment of 267 C-terminal amino acids from ORF1 of RT1 and RT2. Identities are indicated by vertical lines; Cys motifs are underlined.
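The Cys motifs marked in Fig. 2 are described in the text by spacing notation such as CX2CX4HX4C (a cysteine, any two residues, a cysteine, any four residues, a histidine, any four residues, a cysteine). A minimal sketch of how such a pattern can be located in a protein sequence is given below. It is an illustration only, not the analysis software used in this study: the helper names `spacing_to_regex` and `find_motifs` are hypothetical, the input is the last RT1 row of Fig. 2 as it appears in the extracted figure text (which may carry extraction errors), and the third pattern, CX2CX3HX6C, is simply read off that row.

```python
import re

# Last RT1 row of the Fig. 2 alignment, copied from the extracted figure text.
# Assumption: the row is free of extraction errors; treat it as illustrative input.
RT1_ORF1_TAIL = (
    "PKEHLRCYRCLEHGHNARDCRSPVDRQNVCIRCGQEGHKAGTC"
    "MEEIRCGKCDGPHVIGDRTCDRSATQ"
)


def spacing_to_regex(motif):
    """Translate spacing notation into a regular expression.

    'X2' becomes '.{2}' (exactly two residues of any kind) and
    'X1-3' becomes '.{1,3}' (one to three residues). Other letters,
    such as C and H, are kept as literal residues.
    """
    def repl(match):
        low, high = match.group(1), match.group(2)
        if high is None:
            return ".{%s}" % low
        return ".{%s,%s}" % (low, high)

    return re.sub(r"X(\d+)(?:-(\d+))?", repl, motif)


def find_motifs(seq, motif):
    """Return (0-based start, matched text) for every occurrence of motif.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(%s))" % spacing_to_regex(motif))
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Canonical nucleocapsid-like spacing described in the text.
    for start, hit in find_motifs(RT1_ORF1_TAIL, "CX2CX4HX4C"):
        print(start, hit)
    # Third motif: three residues between the second cysteine and the histidine.
    for start, hit in find_motifs(RT1_ORF1_TAIL, "CX2CX3HX6C"):
        print(start, hit)
```

On this row the canonical pattern is found twice and the three-residue variant once, in agreement with the statement that the first and second motifs of RT1 have the canonical spacing and the third does not. Start positions are 0-based offsets within the row, not alignment coordinates.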

(data not shown). It is flanked at each end by a 17-bp presumed to be full length on the basis of restriction patterns duplication of a 28S rDNA sequence (55). The first of two (data not shown). Clone Ag1O6, containing a full-length RT2 ORFs is preceded by a 2,427-bp noncoding leader with element, was selected for sequencing. The complete se- several interesting features. Beginning at position 214, there quence of this RT2 element was determined (data not are 3.5 consecutive imperfect subrepeats approximately 500 shown). It is 6,731 bp long and flanked by the same 17 bp of nucleotides long. These are followed by a poly(T) tract 28S rDNA as the RT1 elements. The 1,134-bp 5' noncoding approximately 60 bases long. Beyond this tract is a sequence region is almost 1,300 bp shorter than the corresponding containing two 55-bp subrepeats separated by 160 bases. region from RT1, accounting for the difference in overall Each subrepeat is very G+C rich (67%) and contains several length between the two elements. Unlike RT1, this region is palindromes that potentially participate in secondary struc- devoid of any significant subrepeats or palindromes. How- ture. The last of the two subrepeats overlaps the beginning of ever, the nucleotide composition switches abruptly from ORF1 by 38 bp. segments rich in A+T to segments rich in G+C, particularly ORF1 spans positions 2428 to 4266, ending with the C in the mRNA-equivalent strand. termination codon TGA. A potential initiation codon is ORFi begins at position 1134 and ends at position 2858 found 55 bp from the beginning of the ORF. No other with the sequence TGA. The 5'-proximal ATG is not en- methionines are found for 75 codons downstream. This countered until position 1533, almost 400 bp downstream. finding, together with a favorable context for initiation (44), Although a purine (G) is found three positions upstream, it is suggests that the first methionine is the initiation codon. 
uncommon for the initiation codon to occur so far from the beginning at this AUG triplet would produce a beginning of the ORF (44). Sequence comparisons with other 595-amino-acid protein with a net charge of +19 at pH 7.5. copies of RT2 in the 5' noncoding region suggest that Agi06 Most striking is a preponderance of glutamine residues, 21% is a full-length copy, but the possibility that AgiO6 is (125 of 595) overall. Further, they are clustered at the center defective near the beginning of ORFi cannot be ruled out. of the ORF and occur in runs of up to 20 consecutive Translation from the 5' ATG codon would produce a 442- glutamines, reminiscent of the opa repeat (65). This repeat, amino-acid protein with a net change of +24 at pH 7.5. which is defined by repeated triplets of CA(G, A, C) and Although not as glutamine rich as its counterpart from RT1 encodes glutamine or histidine, is present in gene products (13 versus 21%), glutamine is a predominant residue in this that are expressed in a developmental or tissue-specific putative peptide. Three consecutive Cys motifs occur at the manner in organisms such as yeast, fruit flies, humans, mice, C terminus. and rats (31). At the carboxyl (C) terminus of ORF1 are three Overlapping ORFi by 44 bp in the +1 reading frame, Cys motifs, arranged in a pattern typical of sequences from ORF2 spans positions 2812 to 6477. Since no potential retroviral nucleocapsid proteins that recognize single- initiator codon occurs in the first 468 bp, ORF2 may be stranded nucleic acids (7). translated by ribosomal frame shifting. While retroviral The second ORF extends from position 4096 to position frame shifting is -1, in retrotransposon Tyl there is a 7731, ending with the termination codon TAA. The begin- precedent for +1 frame shifting (12). This is similar to the ning of ORF2 overlaps ORF1 by 171 bp in the -1 reading situation that has been described for Li elements from mice, frame. 
Methionines occur 31 and 40 codons from the begin- in which ORFi and ORF2 overlap in the +1 reading frame ning of the ORF, but neither one is in a favorable sequence (32). ORF2 contains regions that show homology to RT and context for initiation of translation. Thus, as can occur with integrase. Following ORF2 is a 3' noncoding region of 244 retroviruses, translation of ORF2 may occur by ribosomal bp, terminating with a putative polyadenylation signal and a frame shifting. This mechanism has also been proposed to poly(A) tail. account for translation of ORF2 in Ri elements (35). ORF2 Comparison of the ORFs. The overall identity between the contains a domain of RT homology followed by two distinct ORFi of RT1 and that of RT2 is 36% at the amino acid level. amino acid sequence motifs, one or both of which may The percent identity increases to 55% when only 267 C-ter- correspond to an integrase domain. minal amino acids are considered (Fig. 2). Thus, the C Following ORF2 is a 304-bp noncoding sequence ending in terminus, following the run of glutamines, is more highly a poly(A) tail. Interestingly, the poly(A) tail of all 5'- conserved than the amino (N) terminus. As noted by Jakub- truncated copies is distinct from that of full-length copies in czak et al. (35), the first of the three conserved Cys motifs that it is interrupted at the third position by a cytidine, found in several Li-like elements is invariant and is identical AAC(A). (55). Preceding the tail by 14 bp is a putative in spacing to motifs in retroviral nucleocapsid genes polyadenylation signal (AATAAA). (CX2CX4HX4C). The second and particularly the third may Nucleotide sequence of an RT2 element. The 5' end of the vary from the spacing of four residues between the second A. gambiae RT2 element was defined from sequence com- cysteine and the histidine. 
Interestingly, while the canonical parisons among different phage clones containing elements spacing is found within the first and second motifs of the VOL. 12, 1992 DISTINCT RETROTRANSPOSONS WITH IDENTICAL TARGET SITES 5105 ribosomal insertion elements RT1, RT2, and Ri from Bom- acid residues in the second motif of RT2 would be unex- byx mon, RlBm (67), and D. melanogaster, RlDm (35), the pected. spacing within the third motifs of RT1 and RT2 is identical to A comparison of the rates of synonymous (silent) and that of F, G, and jockey (three residues) and differs from that nonsynonymous (replacement) substitutions for both ORFs of the two Ri elements (eight residues). The Cys motifs of of RT1 and RT2 was conducted by using the method of Li et the ribosomal insertion elements R2Bm and R2Dm, as well al. (45). In ORF2, the rate of synonymous substitutions as the trypanosomatid site-specific elements CZAR, (expressed in terms of 10-' substitutions per site per year + SLACS, and INGI, more closely resemble the motif found in the standard error) was sixfold higher than the rate of factor TFIIIA (63). This type of Cys motif nonsynonymous substitutions (2.1 + 0.2 versus 0.35 + 0.01). interacts with double-stranded DNA (7). For ORF1, when the comparison was based on alignment of Overall identity between ORF2 of RT1 and that of RT2 is the entire coding region, no significant difference in rates of high, 60% at the amino acid level. Except for a glutamine- substitution was observed, probably owing to the lack of a rich stretch present only in RT2 near the N terminus (31 of 55 meaningful alignment. This suggests an accelerated rate of residues, including runs of 10 and 6 glutamines), this high evolution at the N terminus. However, if the comparison level of identity is upheld throughout the length of ORF2. 
was limited to the last 816 bp of the ORF, again a sixfold Centrally located in both ORFs is a region with extensive difference in rates was found (2.0 + 0.4 versus 0.35 + 0.03). homology to RT, as has been reported for numerous other The absolute rates are based on an estimated divergence retrotransposons (66, 68). By using the Genetics Computer time of 80 million years between mammals. The rate of Group program Pileup (21), an amino acid alignment was substitution at silent sites in nuclear genes of mammals is at created with the RT domains of RT1, RT2, and eight other least three times lower than that observed in Drosophila non-LTR retrotransposons-RlDm (35), RlBm (67), R2Dm species (60), and since rates of substitution in retroelements (35), R2Bm (13), I (26), Ingi (42), and two from A. gambiae, are generally higher than in nuclear genes because of error- Ti (8) and T2 (5). The alignment confirmed that RT1 and RT2 prone RT (24), the absolute rates are less meaningful than are more closely related to one another (75% identity over the sixfold relative difference between the types of substitu- 280 residues) than either is to any other (28 to 40% identity tions, indicating selection for protein function. over about 270 residues; data not shown). This RT alignment RT gene expression. Developmental Northern blots were also suggested an especially close evolutionary relationship prepared by using both total RNA and mRNA isolated from between Ri ribosomal insertion elements and RT1-RT2. The embryos, larvae, pupae, and adults of the G3 colony of A. 30% of residues shared between RT1 or RT2 and the gambiae. These were probed separately with an internal elements that do not specifically insert into rDNA are fragment of an RT1 clone and an RT2 clone (see Materials comparable to the 29% shared between RT1 or RT2 and the and Methods) and an A. gambiae actin clone (57) as a R2 ribosomal insertion elements. In contrast, RT1 or RT2 positive control. 
While a signal was detected with the actin and the Ri ribosomal insertion elements share 39% of probe, no signal was detected with the RT probes, even after residues in this domain. Supporting this relationship is a prolonged exposure of the blots to film (data not shown). similarity in the target site duplications of RT1-RT2 and Ri Distribution of RT1 and RT2 in the A. gambiae complex. (55). RT1-RT2 and Ri insert in opposite orientations in the The presence of RT1 and RT2 elements in members of theA. rDNA, and 9 bp of the RT1-RT2 target site duplication gambiae complex was assessed by Southern blotting. Total (TATCCCTGT) are identical but reversed in orientation with genomic DNA was isolated from individual adult mosqui- respect to 9 bp of the target site duplication of Ri. It is toes, digested with Sall, fractionated on agarose gels, blot- important to note that all individuals of A. gambiae tested ted, and probed individually with internal fragments of have another, 5-kb insertion in approximately 15% of the 28S cloned RT1, RT2, and the xanthine dehydrogenase (Xdh) coding sequences, 5' to the RT1-RT2 insertion site (16). gene (19) from A. gambiae. Since RT1 and RT2 sequences These insertions, which do not vary in abundance, may do not cross-hybridize on Southern blots, each probe was correspond to Ri and/or R2 elements, but no example has used simultaneously to produce the results presented in Fig. been cloned. 3. N terminal to the RT domain in both RT1 and RT2 are two Figure 3A shows the results of comparisons among vari- potential nucleic acid-binding motifs, one or both of which ous colonies and field-collected specimens. The Xdh probe might correspond to an integrase domain. Like the Cys motif produced a single 5.5-kb band of hybridization in all lanes. found in retroviral integrases, HX3HX22-32CX2CX108 130ER Strong bands of hybridization of the expected size for RT2 (38), the first Cys motif encountered has the sequence were detected from each A. 
gambiae, A. arabiensis, and A. HX5HX25CX2CX9=101ER. A similar motif has been re- quadriannulatus colony and field specimen assayed. No RT2 ported from the trypanosomatid site-specific elements, al- signals were detected from A. merus colonies or field spec- though it occurs at the 5' end of ORF2, N terminal to the RT imens. Although noA. melas specimens were assayed, RT2 domain (63). Encompassed by the first Cys motif, the second does occur in this species because a region of this element potential Cys motif in RT1, CX2CX7HHX4C, shares two has been successfully cloned by PCR and sequenced from cysteines with the first motif. In RT2, the two histidines are the BAL colony (10). replaced by two aspartic acid residues. This second Cys The distribution pattern of RT1 was complex. An intense motif more closely resembles the pattern of cysteine and band of hybridization of the expected size for RT1 was histidine residues that characterizes the Cys motifs in ORFi. detected from the A. gambiae colony assayed. In addition, Interestingly, the second Cys motif matches a consensus, faint RT1 signals detected from both A. arabiensis colonies CX1l3CX7sHX4C, derived from similar motifs found in the indicated the presence of RT1 in lower copy number in this same relative location in eight other non-LTR retrotrans- species. No RT1 signal was detected from A. quadriannula- posons (35). The relative importance of the first and second tus or either of two A. merus colonies. Again, A. melas was Cys motifs is not known. If they play a role in target site not assayed, but the occurrence of RT1 in this species has recognition, however, the substitution of two negatively been demonstrated by PCR amplification and sequencing charged histidine residues by two positively charged aspartic (10). A faint RT1 hybridization signal was detected from a 5106 BESANSKY ET AL. MOL. CELL. BIOL.

1 2 3 4 5 6 7 8 9 101112 TABLE 1. Copy numbers of RT1 and RT2 in selected colony and A field mosquitoese 5.5 0- --Xdh Copy no. Mosquito and colony or line RT1 RT2 '-pk,. -RT2 2.8 o- -! A. gambiae colonies F IS-.1111.[I., r.- G3 20 4 N * -RT1 SUA 76 18 A. arabiensis colonies F F F F F F F F F M M M M ARZ 2 32 B GMAL 1 53 A. quadiiannulatus SQUAD ob 7 5-5 - -Xdh A. merus colonies V12 Ob ob ZULU ob ob 2.8 - - ~~~~-:- -RT2 A. arabiensis Kl line 3 24 1.9 0- ..: ob .iw, 'Momr- -I' --o RT1 A. quadriannulatus CHIL line 10 PI A. gambiae lines FIG. 3. (A) Hybridization patterns of Xdh, RT1, and RT2 to 25, male 24 10 SalI-digested genomic DNA extracted from individual adult fe- 28, male 4 10 males. Lanes: 1 to 6, laboratory specimens of A. gambiae (SUA) 34 (lane 1), A. arabiensis (ARZAG and GMAL) (lanes 2 and 3), A. Sib A 38 23 quadriannulatus (SQUAD) (lane 4), andA. merus (V12 and ZULU) Sib B 37 19 (lanes 5 and 6); 7 to 9, progeny of wild-caught A. gambiae females Sib C 4 19 (no. 25, 28, and 34); 10 to 12, wild specimens ofA. arabiensis (lane Sib D 5 20 10), A. quadriannulatus (lane 11), and A. merus (lane 12). (B) Sib E 4 20 Hybridization patterns of Xdh, RT1, and RT2 to SalI-digested Sib F 31 17 genomic DNA extracted from progeny of wild-caught female 34. F, Sib G 33 18 female; M, male. Sib H 39 20 Sib I 50 26 Sib J 8 30 Sib K, male 7 28 Sib L, male 10 28 Sib M, male 80 22 Kenyan specimen of A. arabiensis, and no signal was Sib N, male 79 20 detected from A. quadriannulatus and A. merus field speci- mens from Zimbabwe and Kenya, respectively. Three field a Individuals are female unless otherwise indicated. Copy number refers to specimens of A. gambiae, the daughters of blood-fed fe- haploid as determined by the element/Xdh activity ratio, assuming no dosage compensation. Since males are heterogametic, copy number estimates males captured at the same time from one Kenyan village, based on this ratio were doubled. Sib, sibling. 
revealed a surprising pattern of hybridization with the RT1 bAbsence of elements was supported by PCR. probe. One, from family 25, showed a moderate hybridiza- tion signal. The next, from family 28, showed a signal no stronger than that of the Xdh band. A third, from family 34, ant element copies, were omitted from the analysis. The showed an intense hybridization signal. results for both blots are given in Table 1. To investigate this phenomenon more closely, a Southern The copy numbers of RT1 and RT2 in two laboratory blot containing genomic DNA of siblings from family 34 was colonies ofA. gambiae varied about fourfold, from 20 to 76 prepared as before and probed with the Xdh, RT1, and RT2 copies of RT1 and from 4 to 18 copies of RT2. These probes (Fig. 3B). The hybridization signal with the RT2 numbers are consistent with the 10 to 30 copies of RT2 probe was uniform, taking into account the fact that sons, estimated from the offspring of field-collected females. How- with about half of the signal intensity, have only one X ever, a 20-fold variation in copy number (4 to 80) was chromosome, to which the rRNA loci map (16). With the observed for RT1 among siblings of a single female, field- RT1 probe, two different signal intensities were observed. collected female 34. It is possible that such large copy The bands detected in four of the daughters appeared to have number fluctuations would also have been detected in labo- about half of the intensity of bands detected in another five ratory colonies, had more individuals been assayed. While daughters. Likewise, the bands detected in two sons ap- RT1 predominates over RT2 in wild and laboratory-reared peared to have roughly half of the intensity of those detected A. gambiae, the reverse applies toA. arabiensis, with 1 to 3 in another two sons. We were able to arrive at a more copies of RT1 versus 24 to 53 copies of RT2. The A. 
quantitative estimate of copy number by using for each lane quadriannulatus specimens represent an extreme case, in the Xdh band as a single-copy gene reference and analyzing which RT2 elements are present in low copy number (7 to 10) the blot directly with an instrument that senses and records and RT1 elements are not detected. As noted earlier, both beta emissions. After 4 h of data collection, the total activity element families are apparently absent from A. merus. (expressed as total counts) recorded from RT1 or RT2 in a given lane was divided by the total activity recorded from Xdh in that lane, after correction for the background in each DISCUSSION case. Since males are heterogametic, activity ratios were In this study, we characterized two distinct families of doubled to estimate the copy number per haploid genome. non-LTR retrotransposons from the mosquito A. gambiae Occasional very faint bands detected by the RT1 and RT2 that share a specific insertion site in the 28S coding region of probes in unexpected positions, probably the result of devi- the rDNA. These two element families, RT1 and RT2, not VOL. 12, 1992 DISTINCT RETROTRANSPOSONS WITH IDENTICAL TARGET SITES 5107

only share all of the structural features that typify elements of this class but are more similar to each other at the nucleotide and amino acid levels within the coding regions than are any two other families of such elements described to date, including other site-specific ribosomal insertion elements. The protein sequence divergence between RT1 and RT2 averaged across ORF2 compares to that found within the conserved 3' end of ORF2 among members of the mammalian L1 family derived from humans and mice (32).

We have shown that while the ORF2 regions are more closely related than the ORF1 regions of RT1 and RT2, considerable similarity can still be found between ORF1s, even outside of the conserved nucleic acid-binding (Cys) domains. The relatively high degree of conservation among the ORFs and the strong bias toward silent versus replacement substitutions are consistent with the operation of selective constraints associated with their proposed roles in encoding functional nucleic acid-binding proteins, RT, and integrases. That they are more similar to the R1 ribosomal insertion elements than to other non-LTR retrotransposons, both throughout the RT domain and in target sequence, is also consistent with a hypothesis of divergence of RT1 and RT2 from an R1 ancestral sequence (or vice versa).

It is difficult to reconcile the relative similarity among coding regions with the lack of any obvious relationship among 5' and 3' noncoding regions. This phenomenon has been previously observed within the mammalian L1 family of elements, where the 5' and 3' ends diverge so much more rapidly than the coding regions that there are species-specific length differences at the 3' ends, and no meaningful sequence alignment at either end is possible between members from distantly related species (64). In fact, among the L1 elements of mice, three alternative 5' noncoding regions that lack sequence similarity have been described, the so-called A, F, and V types (39, 46, 54). Types A and F are composed of tandem repeats of 208 and 206 bp, respectively, reminiscent of the 500-bp tandem-repeat structure of the 5' end of RT1. Type V lacks any tandem repetitive structure, as does the 5' end of RT2. Interestingly, F and V sequences are found separate from L1 elements in the mouse genome. Although human L1 elements lack a 5' repeat structure, the 5' end of rat L1 elements also contains a 600 bp long (29). Apart from the mammalian L1 family, two site-specific trypanosomatid non-LTR retrotransposons containing a 185-bp tandem repeat at the 5' end have been described, although related elements that lack this structural feature also exist (2). Thus, it appears that this type of element may be able to recombine at the 5' end, by an undefined mechanism, with unrelated genomic sequences (39) and may do so relatively frequently in an evolutionary time frame. We intend to establish whether the 5' tandem repeats exhibited by RT1 are found by themselves in the A. gambiae genome.

The type A repeats of mouse L1 and the repeats of rat L1 are able to promote transcription (29, 59). However, a tandem repetitive structure is not critical for transcription of non-LTR retrotransposons, since sequences within the first 100 bp of the human L1 element are also capable of promoting transcription (61), as are sequences from D. melanogaster jockey and F elements (52, 53). Differences in both structure and sequence do seem to have important effects on regulation of transcription. For example, although the human L1 promoter is active in rat and mouse cells, its activity appears to be both cell type and stage specific (61). In contrast, jockey seems to be transcribed at high levels at all developmental stages (53). Restrictions on the level and timing of transcription have important implications for the ability of the element to persist and spread in the genome. To the extent that the acquisition of new 5' ends with different transcriptional regulation can lead to the establishment of new subfamilies or families of elements in the same organism, this mechanism might partially underlie the proliferation of many related families in the genome. Unfortunately, we have been unable to detect transcripts from either RT1 or RT2 elements at any developmental stage of the mosquito or from the A. gambiae cell line. Neither RT1 nor RT2 shares any sequence similarity to the motifs present at the 5' ends of F, jockey, and other D. melanogaster non-LTR retrotransposons (52). It is possible that the level of transcription was lower than was measurable by Northern blotting or that the window of active transcription was limited to a very narrow time period at a given developmental stage.

Given the extremely close evolutionary relatedness of members of the A. gambiae complex, the uneven distribution of RT1 and RT2 in both wild and colonized specimens is surprising. The only species that apparently lacked sequences homologous to either element family was A. merus, and A. quadriannulatus harbored only RT2. These results could be attributed to either a sampling artifact or sequence divergence, rather than true absence. A sampling artifact cannot be ruled out but seems unlikely given the consistent failure to detect these element families by either PCR or Southern analysis by using specimens that originated from geographically distant regions, Kenya and Zululand. Sequence divergence is more difficult to discount, however, even though the probes included one of the most highly conserved regions of the coding portion of these elements and non-LTR retrotransposons generally. Sequence comparisons have been made among PCR fragments of RT2 cloned from four members of the species complex (10). They show that of the last 174 nucleotides of ORF2, there are no intra- nor interspecific differences among elements from A. gambiae, A. arabiensis, and A. quadriannulatus but that elements from A. melas differ at 54 (31%) of the positions. If there were elements in A. merus that showed a comparable degree of divergence over the 2- to 3-kb region represented by the probes, then it is possible that washing of Southern blots at high stringency would remove the probe. Experiments are in progress to test this possibility.

The remaining species assayed contained various copy numbers of both subfamilies. Because of the existence of various polyploid tissues in the adult, the potential under-replication of interrupted rRNA genes in those tissues (4), the possibility of somatic elimination of heterochromatic sequences (40), and questions about dosage compensation in male mosquitoes, the copy numbers in Table 1 are not exact. Nevertheless, they reveal striking differences, particularly with respect to RT1 elements, not only between species and between colonies of the same species, but also among the offspring of individual adult females from the field. These differences in copy number of RT1 between isofemale lines suggest the coexistence in a single population of several X chromosomes that differ with respect to the copy number of this element. Within isofemale line 34, males and females both fall into two groups: high copy number, ranging from 31 to 80, and low copy number, ranging from 4 to 10. Thus, Mendelian inheritance may be sufficient to explain the immediate results of this particular mating. Interestingly, the copy number fluctuations detected in RT2 were much lower, particularly within isofemale line 34, in which no striking differences in copy number were observed.

The distribution pattern of RT1 and RT2 needs to be understood in the context of their role as elements that insert specifically into the 28S coding region of a percentage (from 2 to 20%) of the 500 to 600 rRNA genes. In D. melanogaster, individuals with 50% of rRNA genes interrupted in the 28S region, known as bobbed 8 mutants, have significantly longer development times, presumably because of insufficient rRNA. Studies of rDNA transcription in these mutants have shown that interrupted genes are not usually transcribed and that most of those that are transcribed are processed or degraded (36). We assume that a similar phenomenon occurs in A. gambiae. This phenotypic effect is surely subject to selective pressures, particularly in the context of population flushes and crashes associated with rainy and dry seasons and ecologically marginal situations in the field. Other important considerations include recombination mechanisms typical of tandemly repeated DNA, such as rRNA genes. Intrachromosomal recombination would result in rapid loss of part of a tandem array on a single chromosome, and unequal crossing over between chromosomes would result in expansion of the copy number on one chromosome and deletion on the other. Assuming that some of the elements in these families are still capable of amplification by retroposition and that some gene flow between taxa still occurs, albeit at a very low frequency (18), the only conclusion that can safely be made is that the copy number of RT1 and RT2 is unstable within this sibling species complex.

In the face of such flux, it is remarkable that RT1 and RT2 appear to have coexisted for millions of years in competition for the same conserved insertion site. Support for this assertion comes from a sequence comparison between functional I factors from D. melanogaster and D. teissieri (1). Sequences that hybridize to I-factor probes have been detected in all but 1 of the 21 members of the D. melanogaster group and four species outside it, leading to the suggestion that I elements are evolutionarily old components of the of these species (28). The level of single-copy nuclear DNA divergence between D. melanogaster and D. teissieri has been determined by DNA-DNA hybridization to be about a 7% base pair mismatch (14). Assuming a conversion of 1.7% mismatch per million years (14), D. melanogaster may have split from D. teissieri about 4 million years ago. The overall nucleotide mismatch between I factors of D. melanogaster and D. teissieri is 15% (1), twice as high as measured for single-copy nuclear DNA, only a small fraction of which is composed of coding sequences. The overall nucleotide mismatch between the RT1 and RT2 coding regions is 39%. Therefore, discounting the possibility of horizontal transmission, which we have no experimental evidence to support, and assuming comparable rates of DNA divergence in anopheles, RT1 and RT2 may have coexisted for at least 4 million and perhaps as many as 10 million years.

These elements may owe their lengthy coexistence to two factors. First, because of major structural and sequence differences in the 5' noncoding region, they are almost certainly regulated quite differently. These differences in regulation may be reflected in the copy number differences shown in Table 1. The second factor concerns the rapidly evolving N terminus of ORF1. Others have suggested that the first ORF in non-LTR retrotransposons serves a species-specific function, since it seems to be much less conserved among elements from different species than the second, RT-encoding ORF. We propose that the different N termini of the respective ORF1s encode specialized functions unique to each family that allow each to coexist in an individual niche in the genome. Thus, ORF1 differences may have less to do with species specificity than with functional specificity.

ACKNOWLEDGMENTS

This work was supported by World Health Organization/Tropical Disease Research grants 890534 and 900573 and by the National Center of Infectious Diseases of the U.S. Centers for Disease Control.

We gratefully acknowledge the support of staff of the CDC Biotechnology Core Facility Branch, B. Holloway, M. Rasmussen, and E. George, for synthesis of oligonucleotide sequencing primers and assistance with the Betascope Analyzer and S. McKneally for computer assistance. We thank an anonymous reviewer whose helpful comments improved the manuscript.

REFERENCES

1. Abad, P., C. Vaury, A. Pelisson, M.-C. Chaboissier, I. Busseau, and A. Bucheton. 1989. A long interspersed repetitive element-the I factor of Drosophila teissieri-is able to transpose in different Drosophila species. Proc. Natl. Acad. Sci. USA 86:8887-8891.
2. Aksoy, S. 1991. Site-specific retrotransposons of the trypanosomatid protozoa. Parasitol. Today 7:281-285.
3. Back, E., E. V. Meir, F. Mueller, D. Schaller, H. Neuhaus, P. Aeby, and H. Tobler. 1984. Intervening sequences in the ribosomal RNA genes of Ascaris lumbricoides: DNA sequences at junctions and genomic organization. EMBO J. 3:2523-2529.
4. Beckingham, K., and N. Thompson. 1982. Under-replication of intron+ rDNA cistrons in polyploid nurse cell nuclei of Calliphora erythrocephala. Chromosoma 87:177-196.
5. Bedell, J. A., and N. J. Besansky. Unpublished data.
6. Berg, D. E., and M. M. Howe. 1989. Mobile DNA. American Society for Microbiology, Washington, D.C.
7. Berg, J. M. 1990. Zinc fingers and other metal-binding domains. J. Biol. Chem. 265:6513-6516.
8. Besansky, N. J. 1990. A retrotransposable element from the mosquito Anopheles gambiae. Mol. Cell. Biol. 10:863-871.
9. Besansky, N. J. 1990. Evolution of the T1 family in the Anopheles gambiae complex. Mol. Biol. Evol. 7:229-246.
10. Besansky, N. J., and J. A. Bedell. Unpublished data.
11. Besansky, N. J., D. Mills Hamm, and F. H. Collins. Unpublished data.
12. Boeke, J. D., and K. B. Chapman. 1991. Retrotransposition mechanisms. Curr. Opin. Cell Biol. 3:502-507.
13. Burke, W. D., C. C. Calalang, and T. H. Eickbush. 1987. The site-specific ribosomal insertion element type II of Bombyx mori (R2Bm) contains the coding sequence for a reverse transcriptase-like enzyme. Mol. Cell. Biol. 7:2221-2230.
14. Caccone, A., G. D. Amato, and J. R. Powell. 1988. Rates and patterns of scnDNA and mtDNA divergence within the Drosophila melanogaster subgroup. Genetics 118:671-683.
15. Collins, F. H., M. A. Mendez, M. O. Rasmussen, P. C. Mehaffey, N. J. Besansky, and V. Finnerty. 1987. A ribosomal RNA gene probe differentiates member species of the Anopheles gambiae complex. Am. J. Trop. Med. Hyg. 37:37-41.
16. Collins, F. H., S. M. Paskewitz, and V. Finnerty. 1989. Ribosomal RNA genes of the Anopheles gambiae complex. Adv. Dis. Vector Res. 6:1-28.
17. Coluzzi, M., V. Petrarca, and M. A. Di Deco. 1985. Chromosomal inversion intergradation and incipient speciation in Anopheles gambiae. Boll. Zool. 52:45-63.
18. Coluzzi, M., A. Sabatini, V. Petrarca, and M. A. Di Deco. 1979. Chromosomal differentiation and adaptation to human environments in the Anopheles gambiae complex. Trans. R. Soc. Trop. Med. Hyg. 73:483-497.
19. Crews-Oyen, A., D. Mills Hamm, and F. H. Collins. Unpublished data.
20. Deragon, J.-M., D. Sinnett, and D. Labuda. 1990. Reverse transcriptase activity from human embryonal carcinoma cells NTera2D1. EMBO J. 9:3363-3368.
21. Devereux, J., P. Haeberli, and O. Smithies. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucleic
Acids Res. 12:387-395.
22. Di Nocera, P. P., F. Graziani, and G. Lavorgna. 1986. Genomic and structural organization of Drosophila melanogaster G elements. Nucleic Acids Res. 14:675-691.
23. Doolittle, R. F., D.-F. Feng, M. S. Johnson, and M. A. McClure. 1989. Origins and evolutionary relationships of retroviruses. Q. Rev. Biol. 64:1-30.
24. Dougherty, J. P., and H. M. Temin. 1988. Determination of the rate of base-pair substitution and insertion mutations in retrovirus replication. J. Virol. 62:2817-2822.
25. Evans, J. P., and R. D. Palmiter. 1991. Retrotransposition of a mouse L-1 element. Proc. Natl. Acad. Sci. USA 88:8792-8795.
26. Fawcett, D. H., C. K. Lister, E. Kellett, and D. J. Finnegan. 1986. Transposable elements controlling I-R hybrid dysgenesis in D. melanogaster are similar to mammalian LINEs. Cell 47:1007-1015.
27. Feinberg, A. P., and B. Vogelstein. 1984. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137:266-267.
28. Finnegan, D. J. 1989. The I factor and I-R hybrid dysgenesis in Drosophila melanogaster, p. 503-517. In D. E. Berg and M. M. Howe (ed.), Mobile DNA. American Society for Microbiology, Washington, D.C.
29. Furano, A. V., S. M. Robb, and F. T. Robb. 1988. The structure of the regulatory region of the rat L1 (L1Rn, long interspersed repeated) DNA family of transposable elements. Nucleic Acids Res. 16:9215-9231.
30. Gabriel, A., and J. D. Boeke. 1991. Reverse transcriptase encoded by a retrotransposon from the trypanosomatid Crithidia fasciculata. Proc. Natl. Acad. Sci. USA 88:9794-9798.
31. Grabowski, D. T., J. P. Carney, and M. R. Kelley. 1991. A Drosophila gene containing the opa repetitive element is exclusively expressed in adult male abdomens. Nucleic Acids Res. 19:1709.
32. Hutchison, C. A., III, S. C. Hardies, D. D. Loeb, W. R. Shehee, and M. H. Edgell. 1989. LINEs and related retroposons: long interspersed repeated sequences in the eucaryotic genome, p. 593-617. In D. E. Berg and M. M. Howe (ed.), Mobile DNA. American Society for Microbiology, Washington, D.C.
33. Ivanov, V. A., A. A. Melnikov, A. V. Siunov, L. I. Fodor, and Y. V. Ilyin. 1991. Authentic reverse transcriptase is coded by jockey, a mobile Drosophila element related to mammalian LINEs. EMBO J. 10:2489-2495.
34. Jakubczak, J. L., W. D. Burke, and T. H. Eickbush. 1991. Retrotransposable elements R1 and R2 interrupt the rRNA genes of most insects. Proc. Natl. Acad. Sci. USA 88:3295-3299.
35. Jakubczak, J. L., Y. Xiong, and T. H. Eickbush. 1990. Type I (R1) and type II (R2) ribosomal DNA insertions of Drosophila melanogaster are retrotransposable elements closely related to those of Bombyx mori. J. Mol. Biol. 212:37-52.
36. Jamrich, M., and O. L. Miller, Jr. 1984. The rare transcripts of interrupted rRNA genes in Drosophila melanogaster are processed or degraded during synthesis. EMBO J. 3:1541-1545.
37. Jensen, S., and T. Heidmann. 1991. An indicator gene for detection of germline retrotransposition in transgenic Drosophila demonstrates RNA-mediated transposition of the LINE 1 element. EMBO J. 10:1927-1937.
38. Johnson, M. S., M. A. McClure, D. F. Feng, J. Gray, and R. F. Doolittle. 1986. Computer analysis of retroviral pol genes: assignment of enzymatic functions to specific sequences and homologies with nonviral enzymes. Proc. Natl. Acad. Sci. USA 83:7648-7652.
39. Jubier-Maurin, V., G. Cuny, A.-M. Laurent, L. Paquereau, and G. Roizes. 1992. A new 5' sequence associated with mouse L1 elements is representative of a major class of L1 termini. Mol. Biol. Evol. 9:41-55.
40. Karpen, G. H., and A. C. Spradling. 1990. Reduced DNA polytenization of a minichromosome region undergoing position-effect variegation in Drosophila. Cell 63:97-107.
41. Kerrebrock, A. W., R. Srivastava, and S. A. Gerbi. 1989. Isolation and characterization of ribosomal DNA variants from Sciara coprophila. J. Mol. Biol. 20:1-13.
42. Kimmel, B. E., O. K. ole-Moiyoi, and J. R. Young. 1987. Ingi, a 5.2-kb dispersed sequence element from Trypanosoma brucei that carries half of a smaller mobile element at either end and has homology with mammalian LINEs. Mol. Cell. Biol. 7:1465-1475.
43. Kinsey, J. A. 1990. Tad, a LINE-like of Neurospora, can transpose between nuclei in heterokaryons. Genetics 126:317-323.
44. Kozak, M. 1984. Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12:857-872.
45. Li, W.-H., C.-I. Wu, and C.-C. Luo. 1985. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.
46. Loeb, D. D., R. W. Padgett, S. C. Hardies, W. R. Shehee, M. B. Comer, M. H. Edgell, and C. A. Hutchison III. 1986. The sequence of a large L1Md element reveals a tandemly repeated 5' end and several features found in retrotransposons. Mol. Cell. Biol. 6:168-182.
47. MacDonald, R. J., G. H. Swift, A. E. Przybyla, and J. M. Chirgwin. 1987. Isolation of RNA using guanidinium salts. Methods Enzymol. 152:219-227.
48. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
49. Martin, S. L. 1991. Ribonucleoprotein particles with LINE-1 RNA in mouse embryonal carcinoma cells. Mol. Cell. Biol. 11:4804-4807.
50. Mathias, S. L., A. F. Scott, H. H. Kazazian, Jr., J. D. Boeke, and A. Gabriel. 1991. Reverse transcriptase encoded by a human transposable element. Science 254:1808-1810.
51. McClure, M. A. 1991. Evolution of by acquisition or deletion of retrovirus-like genes. Mol. Biol. Evol. 8:835-856.
52. Minchiotti, G., and P. P. Di Nocera. 1991. Convergent transcription initiates from oppositely oriented promoters within the 5' end regions of Drosophila melanogaster F elements. Mol. Cell. Biol. 11:5171-5180.
53. Mizrokhi, L. J., S. G. Georgieva, and Y. V. Ilyin. 1988. Jockey, a mobile Drosophila element similar to mammalian LINEs, is transcribed from the internal promoter by polymerase II. Cell 54:685-691.
54. Padgett, R. W., C. A. Hutchison III, and M. H. Edgell. 1988. The F-type 5' motif of mouse L1 elements: a major class of L1 termini similar to the A-type in organization but unrelated in sequence. Nucleic Acids Res. 16:739-749.
55. Paskewitz, S. M., and F. H. Collins. 1989. Site-specific ribosomal DNA insertion elements in Anopheles gambiae and A. arabiensis: nucleotide sequence of gene-element boundaries. Nucleic Acids Res. 17:8125-8133.
56. Pelisson, A., D. J. Finnegan, and A. Bucheton. 1991. Evidence for retrotransposition of the I factor, a LINE element of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 88:4907-4910.
57. Salazar, C., D. Mills Hamm, C. B. Beard, and F. H. Collins. Unpublished data.
58. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
59. Severynse, D., C. Hutchison III, and M. Edgell. 1989. Transcriptional regulatory sequences within the L1Md family. Presented at the LINE-1 Related Transposable Elements Workshop, Washington, D.C. 9-11 October.
60. Sharp, P. M., and W.-H. Li. 1989. On the rate of DNA sequence evolution in Drosophila. J. Mol. Evol. 28:398-402.
61. Swergold, G. 1990. Identification, characterization, and cell specificity of a human LINE-1 promoter. Mol. Cell. Biol. 10:6718-6729.
62. Toneguzzo, F., S. Glynn, E. Levi, S. Mjolsness, and A. Hayday. 1988. Use of a chemically modified T7 DNA polymerase for manual and automated sequencing of supercoiled DNA. BioTechniques 6:460-469.
63. Villanueva, M. S., S. P. Williams, C. B. Beard, F. R. Richards, and S. Aksoy. 1991. A new member of a family of site-specific retrotransposons is present in the spliced leader RNA genes of Trypanosoma cruzi. Mol. Cell. Biol. 11:6139-6148.
64. Weiner, A. M., P. L. Deininger, and A. Efstratiadis. 1986. Nonviral retroposons: genes, , and transposable elements generated by the reverse flow of genetic information. Annu. Rev. Biochem. 55:631-661.
65. Wharton, K. A., B. Yedvobnick, V. G. Finnerty, and S. Artavanis-Tsakonas. 1985. opa: a novel family of transcribed repeats shared by the Notch locus and other developmentally regulated loci in D. melanogaster. Cell 40:55-62.
66. Xiong, Y., and T. H. Eickbush. 1988. Similarity of reverse transcriptase-like sequences of viruses, transposable elements, and mitochondrial introns. Mol. Biol. Evol. 5:675-690.
67. Xiong, Y., and T. H. Eickbush. 1988. The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non-long-terminal-repeat retrotransposons. Mol. Cell. Biol. 8:114-123.
68. Xiong, Y., and T. H. Eickbush. 1990. Origin and evolution of retroelements based upon their reverse transcriptase sequences. EMBO J. 9:3353-3362.