Multiple Non-LTR Retrotransposons in the Genome of Arabidopsis thaliana
Copyright © 1996 by the Genetics Society of America

David A. Wright,* Ning Ke,* Jan Smalle,†,1 Brian M. Hauge,†,2 Howard M. Goodman† and Daniel F. Voytas*

*Department of Zoology and Genetics, Iowa State University, Ames, Iowa 50011 and †Department of Genetics, Harvard Medical School and Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114

Manuscript received June 13, 1995
Accepted for publication October 14, 1995

ABSTRACT

DNA sequence analysis near the Arabidopsis thaliana ABI3 gene revealed the presence of a non-LTR retrotransposon insertion that we have designated Ta11-1. This insertion is 6.2 kb in length and encodes two overlapping reading frames with similarity to non-LTR retrotransposon proteins, including reverse transcriptase. A polymerase chain reaction assay was developed based on conserved amino acid sequences shared between the Ta11-1 reverse transcriptase and those of non-LTR retrotransposons from other species. Seventeen additional A. thaliana reverse transcriptases were identified that range in nucleotide similarity from 48-88% (Ta12-Ta28). Phylogenetic analyses indicated that the A. thaliana sequences are more closely related to each other than to elements from other organisms, consistent with the vertical evolution of these sequences over most of their evolutionary history. One sequence, Ta17, is located in the mitochondrial genome. The remaining are nuclear and of low copy number among 17 diverse A. thaliana ecotypes tested, suggesting that they are not highly active in transposition. The paucity of retrotransposons and the small genome size of A. thaliana support the hypothesis that most repetitive sequences have been lost from the genome and that mechanisms may exist to prevent amplification of extant element families.
Corresponding author: Daniel F. Voytas, 2208 Molecular Biology Building, Iowa State University, Ames, IA 50011. E-mail: [email protected]
1 Present address: Laboratorium Genetika, Universiteit Gent, Ledeganckstraat 35, B-9000 Gent, Belgium.
2 Present address: Transkaryotic Therapies, Inc., 195 Albany St., Cambridge, MA 02139.

Genetics 142: 569-578 (February, 1996)

Arabidopsis thaliana has the smallest known genome among higher plants (LEUTWILER et al. 1984). With only 10⁸ bp, it lies at one end of the spectrum of plant genome size that extends as high as 10¹¹ bp for some monocot species (BENNETT and SMITH 1976). An unusual feature of the A. thaliana genome is the scarcity of interspersed repetitive DNA (LEUTWILER et al. 1984; PRUITT and MEYEROWITZ 1986). Although highly repeated and moderately repeated sequences together make up ~20% of the genome, the interspersed repeat fraction constitutes only ~2% of the total nuclear DNA (MEYEROWITZ 1992). For most eucaryotes, interspersed repeat sequences are typically mobile genetic elements, the most abundant of which are the retrotransposons, mobile elements that replicate by reverse transcription of an mRNA intermediate.

Two major classes of retrotransposons have been identified that are present in the genomes of diverse eucaryotes (DOOLITTLE et al. 1989; XIONG and EICKBUSH 1990). These retrotransposon classes are distinguished by whether or not they are flanked by long terminal direct repeats (LTRs) and are simply referred to as the LTR and non-LTR retrotransposons. The LTR retrotransposons are further composed of two distinct element groups, the Ty1/copia and the Ty3/gypsy group elements, named after representative retrotransposon families from Saccharomyces cerevisiae and Drosophila melanogaster (XIONG and EICKBUSH 1990). The Ty1/copia and Ty3/gypsy groups are distinguished by the order of proteins encoded by their pol genes and similarities among the sequences of their reverse transcriptases.

We and others have previously assessed the distribution and diversity of Ty1/copia group elements among a wide variety of plant species (FLAVELL et al. 1992a,b; VOYTAS et al. 1992; HIROCHIKA and HIROCHIKA 1993). This was accomplished using a polymerase chain reaction assay that specifically amplified Ty1/copia group reverse transcriptases. Species representing each Division of the plant kingdom were found to carry Ty1/copia group elements, and many species were found to harbor multiple diverse families of these retrotransposons. In cotton, for example, nine distinct lineages of Ty1/copia group elements were identified from the analysis of 89 partial reverse transcriptase sequences (VANDERWIEL et al. 1993). More recently, analysis of sequences in the DNA databases have revealed Ty1/copia retrotransposon insertions adjacent to 21 plant genes, further indicating the abundance of these elements in plants (WHITE et al. 1994). Although less well documented, examples of plant Ty3/gypsy group elements have also been reported (SMYTH et al. 1989; PURUGGANAN and WESSLER 1994), as have two examples of non-LTR retrotransposons (SCHWARZ-SOMMER et al. 1987; LEETON and SMYTH 1993).

Retrotransposons have influenced plant genome organization by contributing to genome size. This occurs because retrotransposition is a replicative process and unlike the DNA transposons, retrotransposons do not excise. Rather, a single insertion can be transcribed to yield mRNA templates, which in turn are reverse transcribed to generate numerous progeny elements. The extent to which retrotransposons can contribute to genome bulk is well exemplified by lily, which has one of the largest characterized plant genomes (BENNETT and SMITH 1976). Two retrotransposon families have been identified in lily, the del1 and del2 elements, which have ~13,000 and 250,000 copies, respectively (SENTRY and SMYTH 1985; SMYTH et al. 1989). The del2 elements alone account for ~4% of the lily genome. This amounts to ~10⁹ base pairs or 10 times the size of the entire A. thaliana genome (SMYTH et al. 1989; SMYTH 1991).

While the A. thaliana genome is small and lacks abundant interspersed repeats, it nonetheless carries numerous diverse Ty1/copia group retrotransposons (VOYTAS and AUSUBEL 1988; VOYTAS et al. 1990; KONIECZNY et al. 1991). We have previously identified and characterized 10 such families that are comparable in number and diversity to element families in species like cotton, which have considerably larger genomes (10⁸ bp vs. 5 × 10⁹ bp) (KONIECZNY et al. 1991; VANDERWIEL et al. 1993). Unlike their counterparts in other plants, the A. thaliana elements have not amplified to appreciable copy numbers. Each element family is represented by one or few insertions.

Here we present evidence for the presence of a class of retrotransposons not previously described in A. thaliana: the non-LTR retrotransposons. One such element, Ta11-1, was identified through analysis of DNA sequences flanking the ABI3 gene. Sequence similarity between Ta11-1 and related retrotransposons provided the basis for a polymerase chain reaction assay, which was used to characterize the

[...] standard laboratory strains; Landsberg carries the recessive erecta mutation.

DNA manipulations: Standard methods were used for DNA manipulations, including the purification of plant, yeast and plasmid DNA, the preparation of Southern filters, the screening of yeast artificial chromosome (YAC) and lambda phage libraries, and enzymatic manipulation of cloned DNAs (AUSUBEL et al. 1987). All filter hybridizations were conducted at 65° by the method of CHURCH and GILBERT (1984). DNA fragments used as hybridization probes were separated on low melting agarose gels; gel slices containing desired DNA fragments were excised, melted and used directly for radiolabeling by random priming (Promega).

The presence of Ta11-1 was initially suggested from the sequence of cosmid clone 4711, which was isolated from a Columbia genomic library (GIRAUDAT et al. 1992). To obtain the entire Ta11-1 insertion as well as flanking sequences, a Columbia YAC library (GRILL and SOMERVILLE 1991) was screened using cosmid 4711 as a hybridization probe. A single clone was identified (EG15A3), from which a 7.2-kb BglII fragment was subcloned that was predicted to span the 5' end of Ta11-1 (Figure 1). Because Southern hybridization analyses suggested that Landsberg did not carry a Ta11-1 insertion, the empty target site was cloned using a Landsberg genomic lambda phage library (VOYTAS et al. 1990); hybridization probes included a 1.1-kb EcoRI/HindIII fragment 3' of Ta11-1 (probe A, Figure 1) and an ~0.8-kb BglII/SalI fragment near the 5' end of the insertion (probe B, Figure 1). The library screen yielded one phage clone with probe A (λDW1) and three unique clones with probe B (λDW2, λDW3, λDW4). λDW1 did not hybridize to phage isolated with probe B. This data, as well as additional DNA sequence of Ta11-1, suggested that YAC EG15A3 was a chimera, with the junction residing at the BamHI site in Ta11-1 (see RESULTS). The bona fide Ta11-1 insertion, therefore, was cloned from a Columbia genomic lambda phage library (kind gift of J. MULLIGAN and R. DAVIS). The library was screened using a 1.8-kb HindIII fragment from λDW1 adjacent to the site of Ta11-1 insertion (probe D, Figure 1).

PCR assay for reverse transcriptases: Two conserved amino acid sequence domains shared among non-LTR retrotransposon reverse transcriptases were used to design completely degenerate oligonucleotide primers (DV0144 = GGGATCCNGGNCCNGAYGGNWT and DV0145 = GGAATTCGGNSWNARNGGRYMNCCYTG, where R = A + G; Y = C + T; M = A + C; S = G + C; W = A + T; N = A + G + C + T). Amplifications were carried out in 50-μl reactions