
Copyright © 1992 by the Genetics Society of America

The Structure and Evolution of Subtelomeric Y' Repeats in Saccharomyces cerevisiae

Edward J. Louis¹ and James E. Haber
Rosenstiel Center and Department of Biology, Brandeis University, Waltham, Massachusetts 02254-9110
Manuscript received September 25, 1991
Accepted for publication March 28, 1992

ABSTRACT
The subtelomeric Y' family of repeated DNA sequences in the yeast Saccharomyces cerevisiae is of unknown origin and function. Y's vary in copy number and location among strains. Eight Y's, from two strains, were cloned and sequenced over the same 3.2-kb interval in order to assess the within- and between-strain variation as well as address their origin and function. One entire Y' sequence was reconstructed from two clones presented here and a previously sequenced 833-bp region. It contains two large overlapping open reading frames (ORFs). The putative protein sequences have no strong homologies to any known proteins except for one region that has 27% identity with RNA helicases. RNA homologous to each ORF was detected. Comparison of the sequences revealed that the known long (Y'-L) and short (Y'-S) size classes, which coexist within cells, differ by several insertions and/or deletions within this region. The Y'-Ls from strain Y55 also differ from those of strain YP1 by several short deletions in the same region. Most of these deletions appear to have occurred between short (2-10 bp) direct repeats. The single base pair polymorphisms and the deletions are clustered in the first half of the interval compared. There is 0.30-1.13% divergence among Y'-Ls within a strain and 1.15-1.75% divergence between strains in the interval. This is similar to known unique sequence variation but contrasts with the 8-18% divergence among the adjacent subtelomeric repeats, X. Subsets of Y's exhibit concerted evolution; however, more than one variant appears to be maintained within strains. The observed sequence variation disrupts the first ORF in many Y's while most of the second ORF, including the putative helicase region, is unaffected. The structure and distribution of the Y' elements are consistent with their having originated as a mobile element. However, they now appear to move via recombination. Recombination can account for the homogenization within subsets of Y's but does not account for the maintenance of different variants.
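As an illustration of how within-strain and between-strain divergence figures of the kind quoted above can be obtained from aligned sequences, the following Python sketch counts mismatches over aligned positions. It is not the procedure used by the authors, which is not specified here. The function names, the toy sequences and the choice to skip gapped (insertion/deletion) positions rather than score them are all assumptions made for the example.

```python
from itertools import combinations


def percent_divergence(seq_a, seq_b, gap="-"):
    """Percent of aligned, ungapped positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            # Assumption: insertion/deletion positions are not scored
            # as base substitutions.
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


def divergence_range(alignment, group_a, group_b=None):
    """Return (min, max) percent divergence.

    With one group, all pairs within that group are compared (within strain).
    With two groups, every member of group_a is compared with every member
    of group_b (between strains).
    """
    if group_b is None:
        pairs = list(combinations(group_a, 2))
    else:
        pairs = [(a, b) for a in group_a for b in group_b]
    values = [percent_divergence(alignment[a], alignment[b]) for a, b in pairs]
    return min(values), max(values)


# Hypothetical toy alignment; these are not Y' sequences.
toy_alignment = {
    "strainA_1": "ACGTACGTACGTACGTACGT",
    "strainA_2": "ACGTACGTACGAACGTACGT",
    "strainB_1": "ACGTTCGTACGAAC--ACGT",
}

if __name__ == "__main__":
    within = divergence_range(toy_alignment, ["strainA_1", "strainA_2"])
    between = divergence_range(
        toy_alignment, ["strainA_1", "strainA_2"], ["strainB_1"]
    )
    print("within strain A: %.2f-%.2f%%" % within)
    print("between strains: %.2f-%.2f%%" % between)
```

Excluding gapped positions keeps base substitutions separate from the insertion/deletion differences, which the abstract describes separately. A different gap treatment would give different percentages.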

Genetics 131: 559-574 (July, 1992)

¹ Current address: Yeast Genetics, Institute of Molecular Medicine, John Radcliffe Hospital, Oxford OX3 9DU, England.

REPEATED sequences are found in all genomes. They can be tandemly arrayed and/or dispersed and can exist in few to many thousands of copies [see ARNHEIM (1983) for review]. Many repeated sequences have known functions such as rDNAs but many others are of unknown function. There is a general observation of concerted evolution within a repeated sequence family. Within-population and -species variation among repeats is much less than expected for independent evolution and is at variance with the observed divergence of sequences between populations and species (ARNHEIM 1983; BALTIMORE 1981; DOVER 1982; LEIGH BROWN and ISH-HOROWITZ 1981; OHTA 1983; SELKER et al. 1981; SLIGHTOM, BLECHL and SMITHIES 1980). This concerted evolution is usually attributed to recombination (including unequal sister chromatid exchange, reciprocal exchange and gene conversion) among members leading to homogenization. Observed homogeneity could also be due to the rapid turnover of sequences, via transposition, with loss of diverged copies via segregation (SELKER et al. 1981).

One class of repeated sequence families are those found adjacent to the telomeres of many eukaryotic chromosomes. These sequences are arranged in a complex mosaic of different repeats. In general, no function has been attributed to these sequences (ZAKIAN 1989). No homologies are found among the known telomere associated sequences of different organisms, in contrast to the functional and sequence homologies of the G-rich telomere repeats (ZAKIAN 1989). Their origins are also unknown, though the line-like nature of telomere associated sequences in Drosophila is consistent with a mobile element origin (VALGEIRSDÓTTIR, TRAVERSE and PARDUE 1990).

The chromosome end in Saccharomyces cerevisiae (see Figure 1) consists of the functional telomere, which is comprised of (G1-3T)n repeats. Adjacent to this sequence are up to 4 tandem copies of the Y' element. This element is highly conserved (based on restriction maps and hybridization) and is found at some but not all chromosome ends (CHAN and TYE 1983b; SZOSTAK and BLACKBURN 1982; WALMSLEY et al. 1984). Internal to these is another less well defined repeated element, X. X sequence is found in the subtelomeric region of all chromosome ends. The (G1-3T)n telomere sequences are sometimes found between X and Y' and between tandem copies of Y's (WALMSLEY et al. 1984); however, not all X-Y' junctions contain telomere sequence (LOUIS and HABER 1991; E. J. LOUIS and J. E. HABER, in preparation). Internal to the X and Y' elements are other repeated sequences (E. J. LOUIS and J. E. HABER, in preparation) which are finally preceded by sequences unique to each particular chromosome arm.

[Figure 1: restriction maps of a yeast chromosome end and of the nine Y' clones; scale bar 2 kb.]
FIGURE 1.-The chromosome end in yeast and the structure of the Y' clones. The chromosome end in yeast consists of telomere sequences (G1-3T)n, represented by a circle, which are preceded by up to four tandem copies of Y' (shaded). These are then preceded by other repeated elements including X before sequences unique to that chromosome arm are reached. The telomere sequences are found between copies of Y's and sometimes between Y's and the adjacent X element. Characteristic restriction sites include XhoI (X), SalI (S), Asp718 (A), BamHI (B), EcoRI (R) and PvuI (P), which are shown for the nine clones. The two size classes of Y's (Y'-L and Y'-S) differ in the region between the SalI and EcoRI sites. Individual Y's were marked with pBR322, SUP11 and URA3 using the sequence shown (striped). This Y' sequence, used to mark the Y's, from PvuI to a SacI site originated from another strain (HOROWITZ and HABER 1984). XhoI fragments of these marked Y's were then cloned directly into E. coli. The size and extent of the nine clones are indicated and the regions sequenced and reported here are underlined.

To study the short and long term effects of recombination on a repeated sequence family, we have examined the telomere associated repeated sequence, Y', in S. cerevisiae in detail. Copy number, location and restriction site variation in two strains have been characterized (LOUIS and HABER 1990b). The majority of Y's fall into one of two size classes first described by CHAN and TYE (1983a). These will be designated here as long (Y'-L) and short (Y'-S) due to an apparent insertion/deletion difference. These two size classes coexist within strains with longs usually outnumbering shorts. There are 19:6 and 7:3, longs:shorts, in the two strains studied here and 7:1 among independent clones from a third strain (CHAN and TYE 1983a). Additional types of Y's not found in all strains include a degenerate form (LOUIS and HABER 1990b) found in strain Y55 that has the internal half but not the telomere adjacent half of Y' and an extra long form found in strain YP1 (our unpublished observation) that has an apparent insertion within 1 kb of the telomere. The Y' elements vary in copy number and location between strains and exhibit apparent concerted evolution based on restriction site and fragment length variation (LOUIS and HABER 1990b).

Recombination involving individual Y's was also studied. Reciprocal exchanges between dispersed copies of Y's do not result in lethal rearrangements as they simply exchange one telomere for another. Ectopic recombination (between copies at different genomic locations) was observed as well as the expansion of single copies into tandem arrays via unequal sister-chromatid exchange (LOUIS and HABER 1990a). Acquisitions and losses of Y's from individual chromosome ends were also observed. These recombinational interactions were nonrandom in the sense that a marked element of one of the two known size classes, Y'-L and Y'-S, tended to recombine with other members of its size class. Transposition could not account for the majority of events, which were all consistent with being recombinational (LOUIS and HABER 1990a). Transposition would result in a gain of a Y' at the recipient location; however, in most cases there was no change in Y' copy number at the recipient end (see DISCUSSION). The recombinational interactions could account for the observed variation and distribution of Y's seen among different strains (LOUIS and HABER 1990a). This system should provide for an understanding of some aspects of repeated sequence evolution in general as well as for telomere associated repeats.

A comparison of different Y' sequences from within and between strains can assess how concerted Y' evolution has been and address several other issues. There is apparent restriction site homogeneity within strains consistent with concerted Y' evolution (LOUIS and HABER 1990b). Is the observed recombination leading to the predicted within strain homogeneity at the sequence level? Subclasses of repeated sequence families can be expected to diverge from each other if they accumulate enough differences to prevent homogenization (WALSH 1987). Are the known size class variants diverging from each other as might be expected for reduced recombination? What is the nature and origin of the known size class variants? The sequence may also provide information regarding the origin and possible function of Y's.

MATERIALS AND METHODS

Cloning and sequencing of Y' regions: The entire Y' families in the two divergent strains, YP1 and Y55, have been characterized in terms of copy number, location and restriction site variation (LOUIS and HABER 1990b). The use of markers selectable in both yeast and E. coli allowed for the direct cloning of several Y's from YP1 (LOUIS and HABER 1990b). The same approach was used to clone several Y's from strain Y55. Nine clones (see Figure 1) representing 8 different Y's, both Y'-L and Y'-S, were obtained from the two strains. The sequence from the PvuI to SacI sites (see Figure 1) in each Y' is not representative of the resident sequence as this region was used in the transplacement of markers into the Y's. This Y' sequence in the transplacement plasmid is from a different strain and has been sequenced (HOROWITZ and HABER 1984). These clones contain Y' sequences from the telomere adjacent XhoI site to the next XhoI site in the adjacent Y' (pYP1-L2 and pYP1-S1) or the adjacent sequences of particular chromosome ends (Figure 1). Five of the clones represent resident Y's: pYP1-L1 from chromosome VII or XV [these chromosomes comigrate in contour-clamped homogeneous electric field (CHEF) gels], pYP1-S1 from chromosome XII, pY55-L1 from chromosome XIV, pY55-L2 from chromosome XVI, and pY55-S1 from chromosome VI. pYP1-ARC is a circular Y' derived by meiotic "popout" recombination involving the same marked Y' as clone pYP1-L1 (LOUIS and HABER 1990b). This meiotic "popout" event also produced a Y'-less chromosome end. The three other clones were obtained from recombinant Y's after selection for the duplication of a marked Y' (LOUIS and HABER 1990a). pYP1-L2 is the terminal Y' of a tandem pair that resulted from a marked single Y' from one end of chromosome XVI moving to the distal end of a resident single Y' on another chromosome (IX). The other two resulted from the recombinational transfer (either reciprocal exchange or gene conversion) of the marker sequences to unmarked Y's at another chromosome. pYP1-L3 is the result of a recombination event between a marked Y' (from chromosome V or VIII) and an unmarked single Y' (at chromosome X). pYP1-L4 is the result of a recombination event between a marked Y' (from chromosome V or VIII) and an unmarked single Y' (at chromosome IX).

The sequence of one Y' and its adjacent sequences [reconstructed from pYP1-ARC and pYP1-L1 and the previously sequenced PvuI to SacI region (HOROWITZ and HABER 1984)] was determined by di-deoxy sequencing (SANGER, NICKLEN and COULSON 1977) using Sequenase version 2.0 according to manufacturer's instructions. In addition, a 3.2-kb interval for the five other Y'-Ls and the equivalent 2.0 kb in the two Y'-Ss were sequenced. The Y'-Y' junction sequences in pYP1-L2 and pYP1-S1 were also determined. These sequences have been submitted to GenBank and have accession numbers M58718, M58719-M58725, M59468 and M59469 for the entire Y' of pYP1-L1/pYP1-ARC, the internal interval of pYP1-L2, pYP1-L3, pYP1-L4, pYP1-S1, pY55-L2, pY55-L1, pY55-S1 and the Y'-Y' junctions of pYP1-L2 and pYP1-S1, respectively. The Y' clones were subcloned into pGEM5Zf(-) and synthetic oligo 15-mer primers were used. In addition, nested deletions were created in some regions using lambda ExoIII and mung bean nuclease (SAMBROOK, FRITSCH and MANIATIS 1989). The previously sequenced region (HOROWITZ and HABER 1984) used for transplacement was not entirely resequenced. All other regions presented were sequenced on both strands. Ambiguities were resolved by the use of dITP (Sequenase version 2.0 manual).

The GenBank data set (release version 66.0) was searched for homologies using the programs available from GCG (Genetics Computer Group) (DEVEREUX, HAEBERLI and SMITHIES 1984). Additional analysis and graphic representation of data was obtained using DNA Strider (Commissariat à l'Energie Atomique, France). Both the DNA sequence and putative protein sequences were used in the searches.

RNA analysis: Total yeast RNA was isolated from cultures of Y55 and YP1 (SHERMAN, FINK and HICKS 1986). A preliminary northern analysis failed to reveal any Y' message. A more sensitive RNase protection assay (LEE and COSTLOW 1987) was used to detect the presence of message and to check for possible pre-messenger RNA processing. Subclones in pGEM5Zf(-), containing open reading frame regions, were used to prepare 32P-labeled antisense RNA probes (see Figure 2A) (MELTON et al. 1984). RNase protection of mRNA:antisense RNA hybrids using Ambion's RPA kit (Ambion, Inc., Austin, Texas) was carried out according to manufacturer's instructions.

RESULTS

One entire Y' has two large overlapping open reading frames (ORFs): The two clones pYP1-L1 and pYP1-ARC can be used to reconstruct the entire Y' element, except for the PvuI to SacI region (bases 4829 to 5657), and its adjacent sequences from one end of chromosome VII or XV. The ORF map of the sequence is represented in Figure 2A and the sequence of the Y' is given in Figure 3. Bases 1-232 are the end of the adjacent X element. Comparison of this sequence with other X elements is given elsewhere (LOUIS and HABER 1991). Bases 233 to 316 are the internal (G1-3T)n telomere sequences sometimes found between Xs and Y's. The 6654-bp Y' element is from base 317 to 6970 and is followed by an additional 116 bp of telomere sequence.

A search for ORFs reveals two large ORFs as represented in Figure 2A. ORF1 is from 1171 to 2880 and ORF2 from 2646 to 6560 (the first ATG in ORF2 is at 2886). ORF2 is -1 bp out of frame with respect to ORF1. Analysis of the codon usage [TESTCODE (FICKETT 1982)] indicates that most of the Y' sequence is likely to be actual coding sequence. However, the codon adaptation index (SHARP and LI 1987) for yeast sequences is low for each of these ORFs (0.089 and 0.104 for ORF1 and 2, respectively). Although the transcription regulation sequences in yeast are not well defined, there are potential start (TATA) and termination (ZARET and SHERMAN 1982) signals within 100 bp prior to the first ATG of ORF1 and after the stop codon of ORF2, respectively. The TACTAAC recognition sequence for intron splicing (LANGFORD et al. 1984) is found at 2595 and is preceded by the 5' signal (GTATGT) at bases 2332 and 2358 and is followed closely by the 3' signal (AG) at 2628 and 2648. Use of the closest pair of signals to the TACTAAC site brings the two ORFs in frame though there is a stop codon near the splice junction.

[Figure 2: A, ORF map of the entire Y' element in reading frames 1-3, bases 1000-7000, with the TATA, GTATGT, TACTAAC and AG signals, the putative RNA helicase region and the ARS marked. B, the interval sequenced in pYP1-L1, pYP1-L2, pYP1-L3, pYP1-L4, pYP1-S1, pY55-S1, pY55-L1 and pY55-L2, with probes A-D.]
FIGURE 2.-Open reading frame map and general structure of Y's. A, The ORF map for the entire Y' element (pYP1-L1, pYP1-ARC and the PvuI to SacI region) and its adjacent sequences is shown. Short lines indicate start codons while tall lines indicate stop codons. The potential transcriptional signals as well as splicing signals are shown. The region potentially encoding an RNA helicase is underlined. B, The region sequenced in all eight Y's is represented. The black indicates sequences missing relative to the Y'-Ls of YP1. The V indicate single base insertions relative to the other Y's. The shading in the second half of the interval represents the relatedness of the sequences. The Y's fall into one of two sequence types indicated by the shading. The lines labeled A-D represent the antisense RNA probes used in the RNase protection assay (see Figure 5).

Structural features of the element include the previously described tandem 36-bp repeats found at 4930 to 5361 (HOROWITZ and HABER 1984) and the ARS1 element (SHAMPAY, SZOSTAK and BLACKBURN 1984; WILLIAMSON 1985) found at 6749. There is another tandem array of three T(T/G)AGGGCTAT repeats found at 6828 to 6857 that contains two copies of the vertebrate telomere sequence TTAGGG. Several

1 GAATAGCAGGGTAAGGTGGGTAGTGGAGGTTGGATATG~AATTGG~GGT~CG~TATG~ACGATGGGTTGGTG~~~GTAGA~GATGGAT

101 GGTGGTTGGAGCGGTATG~AAGGGGACA~T~GA~GG~GGT~~~TGGA~~~GTT~AGACATGT~TC~TAGGGT~MTAG

201 GGTAGGGnAGGGTAGGGTTAGGGTAGTGTTAGGGtgtgggt 301 stgggtstsstgtgg~ATATATATGTCACTGTATTGATGCTGGATG~~AGA~GC~~GG~ATAT~~~TA~~MCCTT~~GpYP1-L1 """""""""""""""""""""""~ pYPl-LZ _"""""""""""""""""""I ""_ pYP1-S1

401 AAAATAGGCAATATTICCTG~AGGCGATTG~;ACGCAGAApYP1-L1 ""_"""""""""""""""""""L pYPl-L2 """"""""""~"""""""""""~- pYP1-s1

501 AGTAACTTGATACGTCGTGG~GAATGG~~~TCTTA~GGCGGGGTAATACAT~GGGG~TTTG~TG~TG~~~~TATGT~T 601 ACGCAAAAAGGGTC~CTTC~GGGAGGTC~ATACCTAAT~TGT~~AT~~TATTATATTG~~~~GCTAGGGA 7 01 AGAAGTTGTAGGCTAAGCGGT~CGTAGGTC~TA~AAA~A~~~TATC~CG~~GG~GAGC~~~~TCCTG~T

801 CC'PCGACTAAGCAGATAGTTAAGATACTAAGATACTGTGCACCATGGG 901 AATGAGACATCCTTCTGmTCTATG~~GA~GA~GTCG~TATCTTAGTGAGATT~~ATTAACTGAA~~GTGCT~TGGAGA~C 1001 ACCTGCATAGCGCAGATTCGTT~TCAATAGAGT 11 01 TGATGGAmCTTGTCAAAAAGCATAACAATCAACATACTAT 1201 AAAGCAAACTTTGACGAGTTTG~~GG~CTAAATAACAC 1301 GGTCATTCTACGAAGACGAAAAGTCTGGCCTAAT~~~TAAAATTC~TGGTGCAA~GATA~AAAAGGTC~~~TCA~C 1401 CGTCATGGTCGGGAAAAATGTACAAAAGTTCCTGACATTI 1501 AAAATCAACTTGATGGTCTACACGTTGrnCAAGTGCATACTTT~TTCAATA~GGATTACGAT~CC~~~AC~~CAGAG~T 1601 ACTATAATGA~TGAGTTTCGTGTCCTGGAACGTTGTCACG~TAGCGAGTGC~GGCC~~~GCTCTACGATGC~AC~ACTGAC~T 1701 TTCTGGCGCACCTATTGTAAGGAGTCTTCAGAAAAGCACC pYP1-L1 1"""""" T """"""_ pYPl-L2 """"""""T"""""""" """"""""T"""""""" pYPl-L3 """"""_ """"""_ T""""""_ pYPl-L4 ------~:-----T------pYP1-s1 """""""""""""~pY55-Sl """G"- """""_ pY55-Ll ""-G"- "~""""pY55-L2

1801 CTATCGATmTTCTGCATACCAAGCMGTTT~CTGGC~~GTC~ACAGAGC~CT~CGTGATCTAT~CCAC~~CCA~~ApYP1-L1 """""""""""""""""""""""""""""""""""""-c"--"" """""""""""""""""""""""""""""""""""""-c"--"" pYPl-L2 """"""""""""""""""""""""""""""""""""~pYPl-L3 """"""""""""""""""""""""""""""""""""""" pYPl-L4 """""""""""""""""""""""""""""""""""""""~ pYPl-Sl """"""""""~"""""""""""""""""""""""~"""""" pY55-Sl """""""""""---"""""""""""""""""""""""-c"--"" """""""""""---"""""""""""""""""""""""-c"--"" pY55-Ll """"""""""--c"--""""""""""""""""""""""--c"--"" pY55-L2

1901 TAAA~ACTITTCT~GATATGATGAAC~GACCGATTGGGTGATA~TG~TATTAT~CG~~~TGCG~~TTTCGGpYPl-Ll """"""""""""""--G"--""""""""""""""""""" pYPl-L2 """"""""""""""""-""""""""""""""""-A"""" pYPl-L3 ------G--"------A------pYPl-L4 """""""""""""""G---"""""-~""- -G pYP1-s1 ""L""""""""--G"--"""""-G""- -G pY55-S1 """""""""""""""-G"--"""""""""""""""""""""" """""""""""""""-G"--"""""""""""""""""""""" pY55-Ll ------G-"------A-----G--- pY55-L2

2001 GGCGGGTCCCCGTGGTGGCGTG~GACGAAGAGGATCGTpYP1-L1 """"""""""""""""-c"""""""""""""""""""""" pYPl-L2 """"""""""""""""""""""""""""""""""-~""""" pYPl-L3 """"""""""""""""""""""""""""""""""" ~"""pYPl-L4 DYPl-Sl pY55-Sl ------T------~------"_ pY55-L1 ------T------~------"_ pY55-L2 2101 AAGTTGGTAGTCCTAA~~CAC~GACTCA~C~T~TGCC~~GC~A~GT~CG~TTGTGCTT~~TGCAATAGGGGA~AGpYP1-L1 """c""""""""""""""""""""""""""""~""- ""_ pYPl-L2 """c"""""""""""""""""""""""""""""-~-" ""_ pYPl-L3 """c"""""""""""""""""""""""""""-c"--"- ""_ pYPl-L4 pYPl-Sl pY55-Sl """-C-""" """-C-""" <-- """""""""""""-""-c-""" """ pY55-Ll """-C-"-""<" """"""""""""""""c---""- ""_ pY55-L2 2201 ATATATTTGATGACACCAACGGCG~GAATGTG~TGGA~~TTCTGT~~CGMGTAGC~~~CCACG~~~~~ATA~TpYP1-L1 ------A------G------G------pYPl-L2 """"""A"""""""""""""""""""""""-""""""" pYP1-L3 """"""A"""""""""""""""""""""""""""""" pYPl-L4 pYPl-Sl pY55-Sl ------A------G------pY55-Ll ------A------G------G------pY55-L2 FIGURE3.-The sequence of one entireY' and its relationship to other Y's. The entiresequence of the 6654 bp Y' element reconstructed from pYP1-Ll, pYP1-ARC and the PVuI to Sac1 region along with its adjacent sequences are displayed. The sequences of the other 7 Y's over the same interval are shown. Blanks indicate a missing base whilea - indicates identity with pYP1-Ll/ARC. The Y'-Y' junction sequences of pYPI-L2 and pYP1-SI are also shown. The underlined sequences at the endpoints of the insertion/deletion differences indicate the amount of flanking direct repeat sequence. (Sequence continued on pages 564-567.) 564 E. J. Louis and J. E. Haber

2301 AACmGGTACCTTPTCTTCTTATATC~TATGTG~~TAATCGCGAGTATGTCCGCGG~~~~TG~~TC~TT~ pYPl-Ll """"""_-G """""4" """""""""_"" T" pYP1-LZ """""""""""""" T""""""""""""" T" pYPl-L3 """"""""""""""T"""""""""""""~ T" pYP1-L4 pYP1-s1 pY55-Sl ------T------G------DY55-LI ______I______II______------G------PY55-L2 2400 TAACCTTTGGGGAGAGTTGAACAACTGCTTl"rAT~TA~~GGTTGATA~GCCT~~GCGT~TCG~M~~~CCApYPl-Ll """_ """_ A--A---G ______l__ll_l___l______--- """"""""""""""""A """"""""""""""""A pYPl-L2 """"""""""_ """"""""""_ A-G pYPl-L3 """"""""""""""""""A """"""""""""""""""A """""""""" A-G pYPl-L4 pYPl-Sl pY55-Sl "G""""""" "G""""""" cc-- "" """"_ "_ """"_ pY55-Ll -G"""""""""-cc" """"""""""""- pY55-LZ 2500 AAGCGAGGAATn;ACG~C~TC~~~GATACCTGTI~~~GT~~TGCTGCC~TT~GAGATA~ApYPl-Ll 1""""""- G ""_ C """""""""""""""""pYPl-L2 """""""""""""""""""""""""""""" pYPl-L3 """"""""""""""""""""""""""""""pYPl-L4 """""""_ """""""_ pYPl-Sl I""" """"_ PY554.1 """"A"""""" """"-c"""""""""""-""" DY55-L.1 """-A""""""- """< """"""""" "G "" py55-LZ 2600 ACAAAATGACCGCGGCTCrTAAAAATggAGTWLCTGT;r( oYPl-Ll """"""""_""""""""""""" """"""""_""""""""""""" *"""""" pYP1-L2 -__I------______G___-____--_-______A____l_____ll____ pYPl-L3 ------G------A------pYPl-L4 """"""~"-" """""""- ""-G"" pYP1-s1 """""""c""- """""""" ""-G"" pY55-Sl """""""""G""-" """-G--"-----"G """G"" pY55-Ll """""""""-G""""- """-@""""-G ""G"" pr55-L2 2700 CGITATCCAGAGCTI~TA~CGCT~CC~~~~GCAACGTATACG~~~C~TCC~CTGT~~~~pYPl-Ll """"""""""""""""""""""""""""""""""~pYPl-LZ """""""""_""""""""""""""""""""""""""""""" pYPl-L3 """"""""""""""""""""""""""""""""""""" pYPl-L4 """"" A """"""""- -c-- "" pYPl-Sl """""A """""""""- -c-- "" pY55-Sl """"- A """""""" -c- "" DY55-LI """"_ A """""""" -c" "" pY55-L2 2800 ~GGTI~TCGGAGCCTCGACTTAAGACGCTTGACG~CT~ATTACG~TITTA~GT~CTGTGCTAAGGC~~TATG~CDYP~-L~ """""""""""""""""""""""""""""""""""" pYP1-LZ ------A------C------pYPl-L3 ------A----"----C-"--C------pYPl-L4 """"""""""A-""""""" pYP1-S1 """"""""-A"""""""- pY55-Sl """"""""-A"""""""""""""""""""""""""" pY55-Ll 
""""""""""-A""""""""""""""""""""""""""" pY55-LZ 2900 GCTTGGT~~ATGAC~TITIAA~TGA~CTTGpYPl-Ll """"""""""""""""""""""""""""""""""pYPl-L2 """"""""""""""""""""""""A"""""""""""" pYPl-L3 ------A------pYPl-L4 Dwl-sl pY55-Sl """" """""""_""""""""""""""""""""~ pY55-Ll """" """""""_""""""""""""""""""" pY55-LZ 3000 GTTITGTATTCCTACATGmCn;AATACCGC~GGG~GT~GGTI~TAC~AACT~CG~~TA~TGAGGGAA~CGAA~TGC~~CpYPl-Ll """""""""""""""""""""""""""""""""""~pYPl-LP """"""""_""""""""""""""""""""""""""""" """"""""_""""""""""""""""""""""""""""" pYPl-L3 """""""""""""""""""""""""""""""""""""~pYPl-L4 pYPl-Sl pY55-Sl """""~"""""""""""""""""""""""""""""pY55-Ll """""""""""""""""""""""""""""""""""""" pY55-LZ

3100 AGAAGCTGAATmCGGGAGATGCGTCAGGGGTPWLTTGCCCTAGGACGG~CTGCGT~GTA~~A~~~GATTTGTACGAG~GGCGACGAGpYPl-Ll """"""""""""""""""""""""""""""""" pYPl-LP """""""""""""""""""""""""""""""""""""pYPl-L3 """""""""_""""""""""""""""""""""""""""" """""""""_""""""""""""""""""""""""""""" pYPl-L4 pYPl-Sl pY55-Sl """""""""-A"""""""""""""""""""""- pY55-Ll """""""""A"""""""""""""""""""""""- pY 55-1z

3200 TGAACTCATGGCCAATCATTCCGTTCAAACAGGGCGAAAT DYP~-L~ """_""""_"""""""""""""""""""""""""""- PYPl-LZ """"""""""""""""""""""""""""""""""~pYPl-L3 ~""""""""""""""""""""""""""""""""""- pYPl-L4 pYPl-Sl pYSS-Sl """"""""""""""""""~""""""""""" pY 55-11 """""_"""""""""""""""""""""""""""~ pY55-L2 FIGURE3. Yeast Subtelomeric Repeats 565

3300 CAGGAACGAG~CG~;GA~TG~AGGCC~~GCGA~~~~GT~~GAATTCTAGT~GAATTC~~~~GTA~GG~GpYPl-Ll """"""""""""""""""""_ """"_ """ pYP1-L2 """"~""""<""""""""""" <"""""""" pYPl-L3 """"""""~<""""""""""<""""- ""_" pYPl-L4 """"""""""___I """""""""" ""_ pYPl-Sl """""""""_""""""""""""""" pY55-Sl _""" _""" -G """""""_""" <"""""""" PY55-Ll """-G_"""""""""" <"""""""" py55-L2 3400 GTGAGGCGGCGAGlTCAGATCATGATCAAAAAATTTCAAW pYPl-Ll """"""< """"""1""" -""""""""_" pYPl-LL """""A""C """"""""""""""""""""~pYP1-L3 """""A"< """"""""""""""""""""pYPl-L4 """""""""""""""""~""""""""PYPl-Sl "" "- ""- <""~"""""""-"""""~" ""_" PY55-Sl """"""_ < """""""""1 -"""""""""_ pY55-Ll """""""~<""""""""""" -"""""""""_" PY55-L2 3500 WVIACTCmGGCAGCTCCT~~TGAATTCAG~~TTG~CAGTn;CGCTTATGTCAT~TAT~ATGG~~CA~CC~~~~T~GCCpYP1-Ll """"""""""""""""_ """ ""_" """""_" pYPl-LZ """"""""""""""""""T"- """"""""""""""""""T"- """""""""" pYPl-L3 ------T------pYP1-L4 _""""""""""""""""""""""""""~"~""" _""""""""""""""""""""""""""~"~""" pYP1-s1 """""""_""""""I"""" ""_ "1 "I-" """_ pY55-Sl ------T------pY55-Ll "_"""""""""""""""" "_"""""""""""""""" T"""""""""""~" PY55-L2

3600 CCACCGGGCTATGGTAAGACGGAGTTAT'ITCATCTCCCCA pYPl-Ll "_""""""""""""" "_""""""""""""" -""" """""""""""pYPl-LZ ______------__------_-I----A-".------G _------____ pYPl-L3 ------A------G------pYPl-L4 """"""""""""""""""""""""""""""""""~pYP1-s1 """""_"""""""""""""""""""""""""""" PY55-S1 ------A------pY55-Ll ------A------pY55-LZ 3700 CAGTGTn;CTn;CT~~GCATGAGAATTCAGGTTGAGCCGATGCG~CTTG~TG~GCCCC~TAAG~CTTTA~~GG~GAn;GCG~ACPYPl-Ll """""""""""""""""""""""""""""~""""""""" pYPl-L2 ------G---C------A------pYPl-L3 ------G---C------A------pYPl-L4 ------G------owl-S1 c--- " """"""""""""""""""""""""""""""""""""~PY55-Sl ------G-----C------A------pY55-Ll ------G-----C------A------pY55-L2 3800 TGAT~ATACGTGGGGATACGA~AT~CTAGCACTAA~GAATTCACAGACAGGATA~~GTGGGA~TA~~GA~~~~~CC~CpYPl-Ll """""""""""""""""""""""""""""""~"""""""~" pYPl-LP ""_""""""""""""""""""""""""""""""""""""~ ""_""""""""""""""""""""""""""""""""""""~ pYPl-L3 """"""""""""""""""""""""""""""""""""" pYP1-L4 """"""""""""""""""""""""""""""""""""" pYP1-s1 """"""""""""~"""""""""""""""""""""""~""pY55-Sl """"""""""""""""""""""""""""""""""""~""~"~ pY55-Ll ------T------pY55-L2 3900 AACGTAAAATTGGGTTACCGTAGATGAGTTT~~~~~~GTCT~C~~GTCG~~GGGG~TAA~~CC~GA~TTGpYPl-Ll """""""""""""""""""""""""""""""""""~pYP1-L2 """"""""~""""""""""""""""""""""""""""""""~ pYPl-L3 """""""""""""""""""""""""""""""""""""""~" pYPl-L4 ""_""""""""""""""""""""""""""""""""""""""""~ ""_""""""""""""""""""""""""""""""""""""""""~ pYP1-s1 """""""""""""""""""""""""""""""""""""""""""""" pY55-Sl """"""""""""""""""""""""""""""""""""""""""""~ pY55-Ll """""""""""""""""""""""""""""""""""""""""" pY55-L2

4000 ACGCTTTTGAGAAAGCAATCTTTTTGAGCGGCACAGCACAGCCTGAGGC~TAGC~ATGC~CGT~~~GTAT~~CTTACG~~n;GC~~~CpYPl-Ll """"""""""""""""""""""""""""""""""~"""""~ pYPl-L2 """""""""""""""""c--""""""""""""""""""""""""" A-- pYPl-L3 """""""""""""""c----"""""""""""""""""""""" """""""""""""""c----"""""""""""""""""""""" A-- pYP1-L4 """""""""""""""""""""""""""""""""""""""""" pYP1-s1 ""_"""""""""""""""""""""""""""""""""""" ""_"""""""""""""""""""""""""""""""""""" pY55-Sl """""""~"""""~ <"""-"""""""""""""""""" A-- pY55-Ll """""""""""""""c---"""""""""""""""""""" """""""""""""""c---"""""""""""""""""""" A-- pY55-LZ 4100 GATGGACATCAACGAGCTCGGTCGGAAGATCTCAGCG pYPl-Ll ""_"""""""""""""""""""""""""""""""""""""" ""_"""""""""""""""""""""""""""""""""""""" pYPl-LZ """""""""""""""""""""""""""""""""""""""" pYPl-L3 """""""""""""""""""""""""""""""""""""""""~ pYP1" """_"""""""""""""""""""""""""""""""""""""~ """_"""""""""""""""""""""""""""""""""""""~ pWl-Sl """""""""""""""""""""""""""""""""""""""""""" PY55-Sl """""""""""""""""""""""""""""""""""""~""""~ py55-n """_"""""""""""""""""""""""""""""""""""""""" """_"""""""""""""""""""""""""""""""""""""""" pY55-LZ 4200 CCTTTAGGGCATGTTCATAAAAmGGAAGAAAGTGGAATCACAGCCC~~GCAC~~CTGAATTCTTTT~CCCTCT~~TT~~GAGTCGApYPl-Ll """""""""""""""""""""""""""""""""""""""""~ pYPl-LZ ------C------G------pYPl-L3 ------C------G------pYPI-L4 """"~""""""""""""""""""""""""""""""""""~ pYP1-s1 """""""""""""""""""""""""""""""""""""" pY55-Sl ------C------G------pY55-Ll ------C------G------pY55-LZ

FIGURE3. 566 E. J. Louis and J. E. Haber

[Figure 3 image: sequence alignment panels, positions 4300-6399. pYP1-L1 is compared with pYP1-L2, pYP1-L3, pYP1-L4, pYP1-S1, pY55-S1, pY55-L1 and pY55-L2, followed by pYP1-L1 alone and then pYP1-ARC, pYP1-L2 and pYP1-S1. The aligned bases are not recoverable from the extracted text.]

[Figure 3 image, continued: positions 6499-6997 for pYP1-ARC, pYP1-L2 and pYP1-S1. The aligned bases are not recoverable from the extracted text.]
FIGURE 3.

other short (up to 10 bp) direct and inverted repeats are found throughout the sequence (some are indicated in Figure 3). There are no long direct or inverted repeats though the flanking (G1-3T)n repeats could be considered terminal direct repeats. The sequence is not highly homologous to any sequence in the GenBank data base [release version 66.0 searched with FASTA (PEARSON and LIPMAN 1988)] other than the previously published partial Y' sequences.

The putative protein has homologies with helicases: The putative proteins from the two ORFs were used to search the GenBank data base. One region has homology to helicases as indicated in Figure 2. The use of FASTA (PEARSON and LIPMAN 1988) found homologies with mouse eIF-4A (RAY et al. 1985) and Drosophila vasa (HAY, JAN and JAN 1988) which are putative helicases. There is also homology with E. coli UVRB (BACKENDORF et al. 1986), which has helicase activity in conjunction with UVRA (OH and GROSSMAN 1989). The best homology is 27% identity over 400 amino acids with TIF1 and TIF2 (LINDER and SLONIMSKI 1988) which are thought to be yeast RNA helicases (see Figure 4). BESTFIT (SMITH and WATERMAN 1981) and GAP (NEEDLEMAN and WUNSCH 1970) were used to align TIF1/2 to the Y' sequence. The quality of the alignment, 170.8, was significantly greater than that of ten random alignments, 124.0 ± 2.7. All seven domains conserved among helicases (GORBALENYA et al. 1989; HODGMAN 1988a,b) are found in the Y' sequence. Within these domains, the sequence identity rises to 35%. Recently GORBALENYA et al. (1989) and KOONIN (1991) subdivided the RNA helicases into several families. The Y' helicase-like domains fit the viral family ("DExH" in domain II and "QxxGRxxR" in domain VI) rather than fitting the "DEAD" family of helicases (including TIF1/2, vasa and eIF-4A having "DEAD" in domain II and "HxxGRxxR" in domain VI). No other striking homologies were found though there was a preponderance of short stretches of homology to viral polyproteins over the entire lengths of both ORFs.

Y's are apparently expressed: Total RNA isolated

[Figure 4 image: alignment of the Y' putative protein (residues 52-435) with TIF (residues 4-393); the conserved helicase domains are marked above the aligned sequences.]
FIGURE 4.-Y's potentially encode an RNA helicase. The alignment of the Y' sequence to yeast TIF1 and TIF2 (LINDER and SLONIMSKI 1988) using BESTFIT (SMITH and WATERMAN 1981) is shown. This alignment was significantly better than ten randomized alignments (quality 170.8 vs. 124.0 ± 2.7 for the random alignments). There is 27% identity along the whole length of the alignment and 50% similarity. The seven conserved domains from several known and putative helicases (GORBALENYA et al. 1989; HODGMAN 1988a,b) are shown above the aligned sequences. The sequence identity within these domains rises to 35%. The "DExH" in domain II and the "QxxGRxxR" in domain VI make the Y' sequence more viral-like than like the "DEAD" family of helicases (KOONIN 1991).
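The two criteria used in the Figure 4 legend, the domain II/VI motifs and the comparison of the alignment quality with randomized alignments, can be illustrated with a minimal Python sketch. The function names and the peptide fragments are hypothetical and invented for illustration; they are not the authors' procedure or the actual Y' or TIF sequences. The sketch also assumes that the "± 2.7" is a standard deviation, which the text does not state.

```python
import re

# Motifs as quoted in the text; "x" in the paper means any residue ("." in a regex).
FAMILY_MOTIFS = {
    "DExH (viral-like)": {"II": r"DE.H", "VI": r"Q..GR..R"},
    "DEAD": {"II": r"DEAD", "VI": r"H..GR..R"},
}


def classify_helicase(domain_ii, domain_vi):
    """Return the family whose domain II and domain VI motifs both match."""
    for family, motifs in FAMILY_MOTIFS.items():
        if re.search(motifs["II"], domain_ii) and re.search(motifs["VI"], domain_vi):
            return family
    return "unclassified"


def sd_above_random(observed, random_mean, random_sd):
    """How many standard deviations the observed quality lies above the random mean."""
    return (observed - random_mean) / random_sd


# Hypothetical peptide fragments, for illustration only.
print(classify_helicase("LIVDEFHNF", "AQRIGRVGR"))
print(classify_helicase("FILDEADEM", "HRIGRGGRF"))

# Quality values quoted in the text: 170.8 observed vs. 124.0 +/- 2.7 for random alignments.
print(round(sd_above_random(170.8, 124.0, 2.7), 1))
```

Under the stated assumption, the observed quality lies roughly 17 standard deviations above the mean of the randomized alignments, which is in line with the description of the alignment as significantly better than random.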

from both strains was subjected to Northern analysis in order to determine if there was any transcription of the Y' elements. No detectable message was seen on Northern assays of total RNA; however, the more sensitive RNAse protection assay did reveal Y' RNA (see Figure 5). The probes used are indicated in Figure 2B. There was RNA homologous to probes from both ORFs. Furthermore, there was no indication of appreciable splicing as a probe spanning the splice junctions (probe C) did not reveal lower molecular weight fragments expected for spliced message. Y'-Ss are apparently not expressed very much as probe B should have detected a significant amount of fragment at a lower molecular mass of 210 bp which is barely seen.

[Figure 5 image: RNAse protection gel; lanes are labeled by RNA source (YP1, Y55, none), probe (A-D) and RNAse treatment (+/-), with size markers alongside.]
FIGURE 5.-Y's are transcribed. Lanes 1-4 represent YP1 RNA protected by probes A through D. Lane 5 contains Y55 RNA protected by probe A. The remaining lanes contain the probes alone, with (+), and without (-) RNAses to show that the probes were full length and that there was no nonspecific protection of the probes. End labelled pBR322 DNA digested with MspI was used as size markers. The low level of a 210-bp fragment for probe B indicates that Y'-shorts are not transcribed to an appreciable extent nor is there appreciable DNA contamination in the RNA preparation. The lack of spliced product fragments for probe C indicates the lack of splicing. The lower molecular weight fragments seen are consistent with mismatches between the probe and some of the message which would result in the cleavage of the protected fragments by RNAses.

Y' sequence variation: Figures 2B and 3 contain the sequence information of all eight Y's over the same interval (from bp 1759 to 4945 in pYP1-L1/ARC). This region includes the 3' end of ORF1 and the 5' end of ORF2. In addition the Y'-Y' junction sequences from two of the clones are shown in Figure 3. The comparison of these Y's confirms the original observation of concerted evolution based on restriction maps (LOUIS and HABER 1990b) and delimits the difference between the long and short size classes. A diagrammatic summary of these differences and their relation to the two ORFs is shown in Figure 2B.

Sequence comparison leads to the following general conclusions which are discussed more fully below. First, the Y'-Ls within one strain are more similar to each other than to the Y'-Ls of the other strain. Second, the Y'-Ss in two different strains are nearly identical and different from the Y'-Ls of YP1 by 6 insertion/deletion differences. Third, the variation between different Y's (both insertion/deletion and single base pair differences) is nonrandomly distributed, frequently disrupting ORF1 while leaving most of ORF2 unaffected.

Y' variation includes several insertion/deletion differences: The two Y'-Ss (pYP1-S1 and pY55-S1) are structurally identical and have two large (602 and 465 bp) and four small (14, 31, 24 and 31 bp) deletions relative to the Y'-Ls of YP1. The Y'-Ls exhibit structural within-strain homogeneity and between-strain divergence. The two Y'-Ls of Y55 have a series of 11 small deletions (13-31 bp) relative to the four Y'-Ls of YP1. Two of these deletions and the endpoint of another are identical to those found in the Y'-Ss. All of these deletions are found in the first half of the interval, disrupting the first ORF, and in the beginning of the second ORF. Most of the deletions are found between short direct repeats of up to 10 bp (see Figure 3). Although some of these deletions are in the vicinity of the potential RNA splicing signals, none of them coincide with the potential RNA splice junctions (see Figure 2B). There are two single bp insertion/deletion differences found among the Y'-Ls. pYP1-L2, L3 and L4 are missing one base relative to pYP1-L1, disrupting the first ORF. pYP1-L3 and L4 have a compensating insertion of a base 155 bp downstream. There is also a single bp insertion/deletion difference in pY55-S1 relative to the other Y's which disrupts ORF2 near the end of the region sequenced.

Y's exhibit concerted evolution: In addition to the concerted nature of the insertion/deletion variation of Y'-Ls, subsets of the Y's exhibit concerted evolution at the sequence level. The differences in pairwise comparisons of Y's are displayed in Table 1 for the first and second halves of the interval compared. The two Y'-Ss differ by only 5 bp even though they were isolated from different strains. The two Y'-Ls from Y55 differ by only 7 bp. pYP1-L3 and L4 differ by only 1 bp. The shared polymorphisms are distributed nonrandomly. In the first half of the interval, pYP1-L3 and L4 share 11 polymorphisms not found in the other Y's while pY55-L1 and L2 share 21 polymorphisms not found in the other Y'-Ls. In the second half of the interval these four Y'-Ls, two from YP1 and two from Y55, share the same 13 polymorphisms not found in the other four Y's. The first half of the interval exhibits the same concerted variation and relationship seen with the insertion/deletion variation. The Y'-Ls within each strain are more alike than between strains and the Y'-Ss are more similar to each other than they are to any of the Y'-Ls. The second half of the interval exhibits variation that contrasts

TABLE 1
Percent divergence between shared sequences of Y's

pYP1-L1 pYP1-L2 pYP1-L3 pYP1-L4 pYP1-S1 pY55-S1 pY55-L1 pY55-L2
pYP1-L1 1.233 0.123 0.185 1.171 0.185 1.110 1.048
pYP1-L2 1.149 0.062 1.048 0.185 0.986 1.110 0.925 1.086 1.276 0.062 1.110 0.986 0.308 0.247 0.308 0.986 1.110 0.062
pYP1-L3 1.276 1.086
pYP1-L4 1.276 1.086 0.000 0.185 1.048 0.370 0.925 2.500 2.250 1.750
pYP1-S1 2.250 2.500 1.175 0.986 1.048 0.247 2.250 1.750 1.750 0.500
pY55-S1 1.750 1.750 2.000 2.250 1.048 0.863 2.376 2.153 2.301 2.301 2.000 2.000 0.185 2.000 2.000 2.301 2.301
pY55-L1 2.153 2.376 2.523 2.301 2.301 2.301 2.250 2.250 0.297 2.250 2.250 2.301 2.301
pY55-L2 2.301 2.523

The lower left half of the table is the percent difference in the first half of the interval (1566 bp in Y'-Ls of YP1, 1347 bp in Y'-Ls of Y55 and 400 bp in Y'-Ss). The percent is calculated over the regions shared so that in the case of a Y'-S versus a Y'-L of YP1, there are only 400 bp to compare. The upper right half of the table is the percent difference in the second half of the interval (1622 bp in all the Y's).
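The calculation described in the Table 1 footnote (differences counted only over the region two elements share) can be sketched as follows. This is a minimal illustration, not the authors' software; the function names, the gap character and the toy sequences are assumptions made here.

```python
def percent_from_counts(differences, shared_bp):
    """Percent divergence from a count of differing sites and the shared length."""
    if shared_bp <= 0:
        raise ValueError("shared_bp must be positive")
    return 100.0 * differences / shared_bp


def percent_divergence(seq_a, seq_b, gap="-"):
    """Percent of differing sites over the aligned positions present in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    shared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # deleted in one element, so not part of the shared region
        shared += 1
        if a != b:
            differences += 1
    return percent_from_counts(differences, shared)


# One difference over the 1622-bp second half corresponds to 0.062%.
print(round(percent_from_counts(1, 1622), 3))
# Two differences over the 400 bp shared with a Y'-S correspond to 0.5%.
print(round(percent_from_counts(2, 400), 3))
# Toy alignment (invented): the gapped column is skipped, leaving 1 difference in 8 sites.
print(percent_divergence("ACGT-ACGT", "ACGTTACGA"))
```

Because the denominator is the shared length, a single difference weighs about four times more in a 400-bp comparison with a Y'-S than in a 1622-bp comparison, which should be kept in mind when reading the lower left half of the table.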

sharply with the first half. There are essentially two sequence types in the second half each shared by four of the Y's. The relationships do not follow strain or size class divisions (see Figure 2B).

There are 78 polymorphic base pairs among the sequences, 55 of which are shared by more than one Y'. Again these are concentrated in the first half of the interval (55/78), as are the deletions. Most of the unique base pair variation (18 of 23 sites) is found in the first 1000 bp of the interval in pYP1-L1 and L2. One fourth (20/78) of the polymorphisms are transversion types (A/T, A/C, G/T and C/G). Only one of the 78 mutations creates a stop codon, in ORF1. One third (27/78) are silent changes that do not change the putative protein sequences, and 45/78 result in the substitution of similar amino acids. There are five mutations to dissimilar amino acids. In seven of the eight Y's, ORF2 from bp 3320 to at least 4945 is uninterrupted by nonsense mutations or deletions (pY55-S1 has a single base pair frameshift at 4715). This region contains the putative helicase domains.

Within-strain differences vary over the length of the Y's. The internal ends of Y's within YP1 appear to be highly conserved while the telomere ends are much more variable. The first 159 bp of the 3.2-kb region sequenced in pYP1-L1, pYP1-L2 and pYP1-S1 are identical (Figure 3) while the last approximately 520 bp of each vary by as much as 4.46% in pairwise comparisons. They also contain many small (1-4 bp) insertions/deletions relative to each other. This region contains the tandem triplication of T(T/G)AGGGCTAT (containing the vertebrate telomere sequence). pYP1-L1 and pYP1-L2 have three copies while pYP1-S1 contains only two copies of this sequence.

DISCUSSION

Y' variation and evolution: The sequence and structure of the Y' elements provide a great deal of information regarding their evolution as well as their possible origin and function. In general, the Y' elements are very conserved with an average of about 1% variation among homologous regions. This is similar to known variation among single copy sequences between different strains. For example two functional URA3 sequences have 10/1170 polymorphic sites (7 of which are in 150 bp of the 5' nontranslated region (ROSE, GRISAFI and BOTSTEIN 1984)) and two PMA1 genes (one from each of the two strains studied here) have 9/2754 polymorphic sites in the coding region (PERLIN et al. 1989). The variation between dispersed duplicated genes in yeast ranges from 5/1185 differences between the functionally equivalent TIF1 and TIF2 (LINDER and SLONIMSKI 1988) to 16% divergence between the functionally nonequivalent PMA1 and PMA2 genes (SCHLESSER et al. 1988). For the tandem array of rDNAs, there is homogeneity within an array for one of two unit size classes (PETES 1980).

The next adjacent telomere associated repeated sequence, X, is the most analogous to Y's in that they are dispersed and telomere associated. The known X sequences, including the 232-bp region of pYP1-L1, vary by as much as 18% in shared homologous regions (LOUIS and HABER 1991) which contrasts sharply with the 1-2% Y' variation. Some X regions at different chromosome ends do exhibit greater homology indicating some degree of homogenization between some X's but there is a great deal more variation among X's than among Y's.

There apparently is a border between adjacent telomere regions that are homogenized and those that are not. The homogeneity in Y's is over most of the element including the end closest to the X elements. This homogeneity can be explained by recombination (LOUIS and HABER 1990a). If recombination is the homogenizing agent, is it restricted to Y' elements? Why does the homogenization not extend to the adjacent X elements?

The telomere ends of Y's are apparently more variable within a strain than other regions (up to 4.5% divergence vs. 1% for internal regions). This could be a reflection of the fact that this region is not in the

[Figure 6 image: panels A and B; in panel B the Y' sequence runs from the internal end to the telomere end.]
FIGURE 6.-Reciprocal recombination among Y's. A, When two different Y's (one striped, one solid) on different chromosomes recombine, segregation will result in the two telomere ends being homogeneous while the internal ends are still heterogeneous 50% of the time. B, All reciprocal exchanges involve the telomere end while the involvement of more internal sequences depends on the exchange point. A gradient of homogenization is expected from continued reciprocal recombination among Y's with the telomere end being the most homogeneous and the internal end being more variable.
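The gradient described in panel B of Figure 6 can be made concrete with a toy model. The sketch below assumes that exchange points are equally likely at every position along the element and that an exchange swaps everything from the exchange point to the telomere end; both assumptions, and the function name, are introduced here for illustration and are not taken from the paper.

```python
from fractions import Fraction


def exchange_involvement(length):
    """Fraction of possible exchange points whose swapped segment includes each position.

    Position 0 is the internal end and position length - 1 is the telomere end.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    counts = [0] * length
    for exchange_point in range(length):
        # Everything from the exchange point to the telomere end is exchanged.
        for position in range(exchange_point, length):
            counts[position] += 1
    return [Fraction(count, length) for count in counts]


profile = exchange_involvement(10)
print(profile[0], profile[-1])  # prints "1/10 1"
```

In this toy model the telomere end takes part in every exchange while the internal end takes part in only one of ten, so involvement rises steadily toward the telomere, which is the gradient the legend describes.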

ORF, however the internal end is also not in ORF. Alternatively, the telomere ends of Y's may be subject to higher rates of mutational change; for instance they may be subject to a higher level of replication error. Reciprocal recombination among Y's would be expected to result in greater homogeneity toward the telomere end as it is always included in the recombination event while internal regions are not necessarily included (see Figure 6). If reciprocal exchange is the homogenizing force in Y's, there must be additional factors involved in the generation or maintenance of variation at the ends of Y's. However, gene conversions that do not extend to the end can explain internal homogeneity while allowing more variability at the end.

Y' variation includes many insertion/deletions which exhibit classical concerted evolution: The Y'-Ls of strain YP1 differ from each other only by single base insertion/deletions and base pair differences as do the Y'-Ls of strain Y55. The two sets of Y'-Ls differ by 11 short insertion/deletions. The known restriction maps of the Y's of the two strains are consistent with the entire sets within each strain being homogeneous (LOUIS and HABER 1990b). This distribution of insertion/deletion differences fits the classical definition of concerted evolution. The Y'-Ss could be considered a second repeated sequence family as they are maintained in the same cells as the Y'-Ls. The Y'-Ss differ from the Y'-Ls of YP1 by six deletions in the region compared. There is a structural relationship between the Y'-Ss and the Y'-Ls of Y55 in that two of the six Y'-S deletions are identical to two in the Y55 Y'-Ls and an endpoint of another is the same as those of the Y55 Y'-Ls.

Most of the insertion/deletion variation appears to occur between short direct repeats of 2-10 bp. This insertion/deletion variation is unusual for yeast. In an analysis of over 700 independent mutations in CYC1 in yeast, there were no deletions of this sort even though short flanking direct repeats were present (ERNST, STEWART and SHERMAN 1981). This type of variation could be due to recombination between the repeats resulting in the deletion of intervening sequences or could be due to replication slippage (STREISINGER et al. 1966). Perhaps Y's are more susceptible to the processes (such as a different mode of replication) leading to this type of insertion/deletion variation.

Y' variation is nonrandomly distributed within an element: Virtually all of the insertion/deletion variation and most of the base pair variation is found in the first half of the interval compared, disrupting ORF1 but leaving most of ORF2 intact. The variation in the first half shows the same relationship among Y's as the insertion/deletion variation with the Y'-L variation following the strain division as expected for concerted evolution (Table 1). Most of the unique variants are found here. The second half of the interval is composed of one of two sets of shared variations (see Figure 2B) that leave most of ORF2 intact. The distribution of these two sets of variants does not follow the strain or size class divisions. In Y55 it appears that the Y'-Ss and Y'-Ls may be diverging from each other in that each is composed of a different set of variants in this region. However in YP1, there are some Y'-Ls that have the Y'-S type sequence in the second half of the interval compared. This mixture can be explained by recombination between size classes in this region. This has been observed experimentally in YP1 (LOUIS and HABER 1990a).

The deletions may create a barrier to homogenizing recombination events when two different size classes are interacting. Short stretches of homology between deletions in the Y'-Ss or the Y'-Ls of Y55 may therefore be expected to diverge from the same sequence in the Y'-Ls of YP1. This is in fact the observation. Much of the variation unique to the Y'-Ss or to the Y55 Y'-Ls is found in the short stretches of sequence between insertion/deletions.

Multiple Y' size classes coexist within strains: Both strains studied here have both Y'-Ls and Y'-Ss. In addition, each has other Y' types (degenerate Y's in Y55 (LOUIS and HABER 1990b) and extralongs in YP1 (unpublished observation)). The strain studied by CHAN and TYE (1983a) also has both size classes of Y's. The presence of multiple size classes could be a reflection of a transient state in the dynamics of Y's within a strain lineage. The processes of homogenization may not have had enough time to lead to the replacement of all Y' types by one size class. Alternatively, there may be additional factors that allow for the maintenance of multiple types. For instance, nonrandom recombinational interactions among Y's have been experimentally observed in YP1 (LOUIS and HABER 1990a). Y's interact preferentially within their own size class. The rate of between size class recombination may be low enough to prevent the replacement of one type with another as this process is continuously countered by the spread of Y's to new locations and the intra-size class recombinational homogenization. The different Y' size classes may eventually diverge enough to be considered distinct repeated sequence families that no longer interact as predicted when recombinational homogenization between elements is reduced (WALSH 1987).

Y' origin: Y's are not distributed widely among yeasts. No Y' homology has been found in distant Saccharomyces species with fewer than 16 chromosomes (JAGER and PHILIPPSEN 1989). Among the most closely related species, Saccharomyces paradoxus and Saccharomyces bayanus (NAUMOV 1987), Y's are found in all strains of S. paradoxus examined but in only one strain of S. bayanus (NAUMOV et al. 1992). Whatever Y's code for is not necessary for being a yeast. This distribution is consistent with a viral or mobile element origin for Y's. Several structural features support this hypothesis. The size of the element and the finding of two large overlapping ORFs is a general structure found for many viruses and mobile elements including the RNA killer viruses (ICHO and WICKNER 1989) and Ty elements (BOEKE 1989) of S. cerevisiae. The fact that the ORFs do not appear to be spliced even with the splice recognition sequences present has also been seen in the Ty elements and the RNA killer viruses. However, no consensus sequence for the -1 frameshifting that occurs in many viruses (JACKS et al. 1988) including the killer viruses (ICHO and WICKNER 1989) is found. The Y's do not have long terminal repeats (LTRs) or inverted repeats but the flanking G1-3T sequences could play the role of LTRs. The finding that Y's are only found at chromosome ends is not inconsistent with a mobile element origin, as many mobile elements are site specific including the yeast Ty3 element (CHALKER and SANDMEYER 1990). The codon adaptation index (CAI) values for the two ORFs are low which is consistent with a mobile sequence origin that has not had enough evolutionary time to alter its codon usage to that of the host (SHARP and LI 1987). The presence of a potential helicase in a conserved region is also consistent with this hypothesis as many viruses have these helicase domains which are thought to be involved in their replication (GORBALENYA et al. 1989; KOONIN 1991). The presence of the insertion/deletion differences is also consistent with a non-yeast origin.

Despite a large amount of circumstantial evidence for a mobile element origin, all of the movements and interactions seen involving Y's can be explained by recombination and most require a recombinational explanation (LOUIS and HABER 1990a). The argument against transposition is displayed in Figure 7. When the marker inserted into a Y' (the donor) is transferred to an end with one Y' (the recipient) the end result could be one of two types. The event could simply be the transfer of the marker sequence with no net change in Y' copy number or the transfer of an entire marked Y' resulting in the gain of a Y' at the recipient end. Transposition of the marked Y' to the recipient end would always result in a net gain of a Y' while reciprocal exchange or gene conversion within the Y' sequence would not. The vast majority of events involve no change in Y' copy number. Those events that do result in Y' copy number changes including the acquisitions and losses of Y's from chromosome ends and the "popout" of ARCs are easily explained by recombination. The gain of a marked Y' in Figure 7 could be explained by unequal exchange in the flanking G1-3T sequences. The Y'-Y' junction of pYP1-L2 is the junction between the telomere of one chromosome with a Y' and another marked Y' (see Figure 1) which moved to that telomere. The Y'-Y' junction sequence of pYP1-ARC contains G1-3T sequence and is consistent with an unequal intrachromatid exchange involving the flanking G1-3T sequences found in the resident Y' from which the ARC form was derived.

Y' function: The conservation of the second ORF including the putative helicase may be a reflection of functional importance. This sequence may be (or have been) important for the maintenance of Y's. The fact that RNA from this region of Y's can be detected supports the thought that there is some function associated with this region. Even if this region is functional, the conservation of this region in most, if not all, Y's remains to be explained. As Y's can move around via recombination (LOUIS and HABER 1990a), why should all Y's maintain this region while the adjacent region (ORF1) is subject to many disruptive mutations? A combination of selection and recombinational homogenization of ORF2 would be expected to result in homogenization of the adjacent regions via hitchhiking. It should be possible to disrupt ORF2 in all Y's of a strain in order to test whether there is a function to this region.

LIU and TYE (1991) found a protein that binds to

the vertebrate telomere sequence TTAGGG found at the end of some X elements and have proposed that it plays a role in telomere formation and maintenance. They suggest that Y's may have similar sequences near their telomere ends that provide an anchor for this protein. Indeed, Y's do have TTAGGGs within 100 bp of the end; however, there are only 2-3 copies separated by 4 bp each. An analysis of other X-Y' junctions (LOUIS and HABER 1991) has shown that these vertebrate telomere sequences are not always found at the ends of Xs. If this sequence is important for telomere maintenance, then Xs without TTAGGGs would have to always be followed by TTAGGG containing sequences such as Y's.

[Figure 7 image: donor end and recipient end before transfer, and the two possible recipient types after transfer.]
Expected and observed frequencies of the two recipient types (single marked Y' / two Y's):
Transposition: 0% / 100%
Y' recombination: 100% / 0%
Unequal G1-3T recombination: 0% / 100%
Observed: >95% / <5%
FIGURE 7.-Y's move via recombination. A recipient chromosome end with a single Y' in the transfer of selectable marker sequence (inverted triangle) from a donor Y' can have one of two forms after transfer. The end could have a single Y' that now has the marker sequence or the end could have two Y's, one with the marker sequence. Transposition would always result in the net gain of Y' copy number at the recipient end. Reciprocal exchange or gene conversion within the Y' sequences would result in transfer of the marker sequence. Unequal exchange involving the flanking telomere sequences would result in a net gain in Y' copy number. The observed frequencies of resulting recipient types are consistent with most if not all transfer events being due to recombination.

Y's may not encode a functional polypeptide but may play a different structural role. One possible role is the recombinational dispersal of Y' elements to many chromosome ends which has recently been observed by V. LUNDBLAD (personal communication). Yeast strains that survive the lethal effects of mutations in EST1 (LUNDBLAD and SZOSTAK 1989) have experienced a global proliferation of tandem arrays of Y' elements at the ends of many if not all chromosomes. Mutations in EST1 result in the lack of maintenance of (G1-3T)n sequences. These long tandem repeat regions may provide a buffer against the progressive loss of a small number of base pairs at each chromosome end during replication at each cell division cycle. If there are any transient states in the natural history of yeast where telomere maintenance functions are disrupted, this ability to spread a buffer sequence to all chromosome ends may be advantageous.

A different possible function that Y' sequences may provide is the protection of adjacent sequences from transcriptional silencing. Recently GOTTSCHLING et al. (1990) have shown that genes inserted within unique DNA immediately adjacent to a telomere become transcriptionally silenced or diminished. For example, a URA3 gene, transcriptionally oriented toward a nearby telomere, exhibits a marked variegated position effect in which it is only occasionally expressed. Consequently the Ura+, Ura- mixed colony is able to grow both on medium lacking uracil and on 5-fluoroorotic acid medium that permits only Ura- cells to grow. In contrast, insertions of URA3 into a Y', oriented in the same direction and located within 1.6 kb of the end of the chromosome, are not transcriptionally silenced. URA3 insertions into Y' are fully Ura+ and do not grow on 5-fluoroorotic medium (LOUIS and HABER 1990a).

The role of recombination in Y' evolution: Several questions remain regarding the evolution, function and origin of Y's. Recombination clearly plays a role in Y' dynamics; however, it cannot easily explain the distribution and maintenance of the variation seen. The concerted nature of much of the variation is most easily explained by recombination. The conservation of one region of Y's while others are subject to disruptive variation requires factors in addition to recombination. Possibly selection for one copy being functional in a multigene family results in the maintenance of functionality in all copies. The apparent maintenance of multiple Y' types also cannot be easily explained. The existing structure of the Y' families may be a transient state that has not evolved long enough to result in the replacement of one type of Y' with another. Alternatively additional processes that result in the maintenance of multiple types may be operating. Finally, if recombination is playing a role in the homogenization of Y's, why does that homogenization not extend into adjacent regions? There may be a barrier to recombinational homogenization between X and Y's.

We would like to thank SANDI HARRIS, VICKI LUNDBLAD and RHONA H. BORTS for comments and discussions.
This work was supported by a Wellcome Trust Senior Research Fellowship to E.J.L., a National Science Foundation grant DCB8711517 to J.E.H., a Sloan Foundation Grant to J.E.H. and National Institutes of Health shared instrument grant RR04671.

LITERATURE CITED

ARNHEIM, N., 1983 Concerted evolution of multigene families, pp. 38-61 in Evolution of Genes and Proteins, edited by M. NEI and R. K. KOEHN. Sinauer, Sunderland, Mass.
BACKENDORF, C., H. SPAINK, A. P. BARBEIRO and P. VAN DE PUTTE, 1986 The structure of the uvrB gene of Escherichia coli. Homology with other DNA repair enzymes and characterization of the uvrB5 mutation. Nucleic Acids Res. 14: 2877-2890.
BALTIMORE, D., 1981 Gene conversion: some implications for immunoglobulin genes. Cell 24: 592-594.
BOEKE, J. D., 1989 Transposable elements in Saccharomyces cerevisiae, pp. 335-374 in Mobile DNA, edited by D. BERG and M. HOWE. American Society for Microbiology, Washington, D.C.
CHALKER, D. L., and S. B. SANDMEYER, 1990 Transfer RNA genes are genomic targets for de novo transposition of the yeast Ty3. Genetics 126: 837-850.
CHAN, C. S. M., and B. TYE, 1983a A family of Saccharomyces cerevisiae repetitive autonomously replicating sequences that have very similar genomic environments. J. Mol. Biol. 168: 505-523.
CHAN, C. S. M., and B. TYE, 1983b Organization of DNA sequences and replication origins at yeast telomeres. Cell 33: 563-573.
DEVEREUX, J., P. HAEBERLI and O. SMITHIES, 1984 A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395.
DOVER, G. A., 1982 Molecular drive: a cohesive mode of species evolution. Nature 299: 111-117.
ERNST, J. F., J. W. STEWART and F. SHERMAN, 1981 The cyc1-11 mutation in yeast reverts by recombination with a nonallelic gene: composite genes determining the isocytochrome-c. Proc. Natl. Acad. Sci. USA 78: 6334-6338.
FICKETT, J. W., 1982 Recognition of protein coding regions in DNA sequences. Nucleic Acids Res. 10: 5303-5318.
GORBALENYA, A. F., E. V. KOONIN, A. P. DONCHENKO and V. M. BLINOV, 1989 Two related superfamilies of putative helicases involved in replication, recombination, repair and expression of DNA and RNA genomes. Nucleic Acids Res. 17: 4713-4730.
GOTTSCHLING, D. E., O. M. APARICIO, B. L. BILLINGTON and V. A. ZAKIAN, 1990 Position effect at S. cerevisiae telomeres: reversible repression of Pol II transcription. Cell 63: 751-762.
HAY, B., L. Y. JAN and Y. N. JAN, 1988 A protein component of Drosophila polar granules is encoded by vasa and has extensive sequence similarity to ATP-dependent helicases. Cell 55: 577-587.
HODGMAN, T. C., 1988a Erratum: A new superfamily of replicative proteins. Nature 333: 578.
HODGMAN, T. C., 1988b A new superfamily of replicative proteins. Nature 333: 22-23.
HOROWITZ, H., and J. E. HABER, 1984 Subtelomeric regions of yeast chromosomes contain a 36 base-pair tandemly repeated sequence. Nucleic Acids Res. 12: 7105-7121.
ICHO, T., and R. B. WICKNER, 1989 The double-stranded RNA genome of yeast virus L-A encodes its own putative RNA polymerase by fusing two open reading frames. J. Biol. Chem. 264: 6716-6723.
JACKS, T., H. D. MADHANI, F. R. MASIARZ and H. E. VARMUS, 1988 Signals for ribosomal frameshifting in the Rous sarcoma virus gag-pol region. Cell 55: 447-458.
JAGER, D., and P. PHILIPPSEN, 1989 Many yeast chromosomes lack the telomere-specific Y' sequence. Mol. Cell. Biol. 9: 5754-5757.
KOONIN, E. V., 1991 Similarities in RNA helicases. Nature 352: 290.
LANGFORD, C. J., F. KLINTZ, C. DONATH and D. GALLWITZ, 1984 Point mutations identify the conserved, intron-containing TACTAAC box as an essential splicing signal sequence in yeast. Cell 36: 645-653.
LEE, J. J., and N. A. COSTLOW, 1987 A molecular titration assay to measure transcript prevalence levels. Methods Enzymol. 152: 633-648.
LEIGH BROWN, A. J., and D. ISH-HOROWITZ, 1981 Evolution of the 87A and 87C heat shock loci in Drosophila. Nature 290: 677-682.
LINDER, P., and P. P. SLONIMSKI, 1988 Sequence of the genes TIF1 and TIF2 from Saccharomyces cerevisiae coding for a translation initiation factor. Nucleic Acids Res. 16: 10359.
LIU, Z., and B. TYE, 1991 A yeast protein that binds to vertebrate telomeres and conserved yeast telomeric junctions. Genes Dev. 5: 49-59.
LOUIS, E. J., and J. E. HABER, 1990a Mitotic recombination among subtelomeric Y' repeats in Saccharomyces cerevisiae. Genetics 124: 547-559.
LOUIS, E. J., and J. E. HABER, 1990b The subtelomeric Y' repeat family in Saccharomyces cerevisiae: an experimental system for repeated sequence evolution. Genetics 124: 533-545.
LOUIS, E. J., and J. E. HABER, 1991 Evolutionarily recent transfer of a group I mitochondrial intron to telomere regions in Saccharomyces cerevisiae. Curr. Genetics 20: 411-415.
LUNDBLAD, V., and J. W. SZOSTAK, 1989 A mutant with a defect in telomere elongation leads to senescence in yeast. Cell 57: 633-643.
MELTON, D. A., P. A. KRIEG, M. R. REBAGLIATI, T. MANIATIS, K. ZINN and K. R. GREEN, 1984 Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing bacteriophage SP6 promoter. Nucleic Acids Res. 12: 7035-7056.
NAUMOV, G. I., 1987 Genetic basis for classification and identification of the ascomycetous yeasts. Stud. Mycol. 30: 469-475.
NAUMOV, G. I., E. S. NAUMOVA, R. A. LANTTO, E. J. LOUIS and M. KORHOLA, 1992 Genetic homology of Saccharomyces cerevisiae with its sibling species S. paradoxus and S. bayanus: electrophoretic karyotypes. Yeast (in press).
NEEDLEMAN, S. B., and C. D. WUNSCH, 1970 A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48: 443-453.
OH, E. Y., and L. GROSSMAN, 1989 Characterization of the helicase activity of the Escherichia coli UvrAB protein complex. J. Biol. Chem. 264: 1336-1343.
OHTA, T., 1983 On the evolution of multigene families. Theor. Popul. Biol. 23: 216-240.
PEARSON, W. R., and D. J. LIPMAN, 1988 Improved tools for biological sequence analysis. Proc. Natl. Acad. Sci. USA 85: 2444-2448.
PERLIN, D. S., S. L. HARRIS, D. SETO-YOUNG and J. E. HABER, 1989 Defective H(+)-ATPase of hygromycin B-resistant pma1 mutants from Saccharomyces cerevisiae. J. Biol. Chem. 264: 21857-21864.
PETES, T. D., 1980 Unequal meiotic recombination within tandem arrays of yeast ribosomal DNA genes. Cell 19: 765-774.
RAY, B. K., T. G. LAWSON, J. C. KRAMER, M. H. CLADARAS, J. A. GRIFO, R. D. ABRAMSON, W. C. MERRICK and R. E. THACH, 1985 ATP-dependent unwinding of messenger RNA structure by eukaryotic initiation factors. J. Biol. Chem. 260: 7651-7658.
ROSE, M., P. GRISAFI and D. BOTSTEIN, 1984 Structure and function of the yeast URA3 gene: expression in Escherichia coli. Gene 29: 113-124.

SAMBROOK, J., E. F. FRITSCH and T. MANIATIS, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
SANGER, F., S. NICKLEN and A. R. COULSON, 1977 DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.
SCHLESSER, A., S. ULASZEWSKI, M. GHISLAIN and A. GOFFEAU, 1988 A second transport ATPase gene in Saccharomyces cerevisiae. J. Biol. Chem. 263: 19480-19487.
SELKER, E. U., C. YANOFSKY, K. DRIFTMIER, R. L. METZENBERG, B. ALZER-DEWEERD and U. L. RAJBHANDARY, 1981 Dispersed 5S RNA genes in N. crassa: structure, expression and evolution. Cell 24: 819-828.
SHAMPAY, J., J. W. SZOSTAK and E. H. BLACKBURN, 1984 DNA sequences of telomeres maintained in yeast. Nature 310: 154-157.
SHARP, P. M., and W.-H. LI, 1987 The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15: 1281-1295.
SHERMAN, F., G. R. FINK and J. B. HICKS, 1986 Methods in Yeast Genetics. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
SLIGHTOM, J. L., A. E. BLECHL and O. SMITHIES, 1980 Human fetal G gamma and A gamma globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21: 627-638.
SMITH, T. F., and M. S. WATERMAN, 1981 Comparison of biosequences. Adv. Appl. Math. 2: 482-489.
STREISINGER, G., Y. OKADA, J. EMRICH, J. NEWTON, A. TSUGITA, E. TERZAGHI and M. INOUYE, 1966 Frameshift mutations and the genetic code. Cold Spring Harbor Symp. Quant. Biol. 31: 77-84.
SZOSTAK, J. W., and E. H. BLACKBURN, 1982 Cloning yeast telomeres on linear plasmid vectors. Cell 29: 245-255.
VALGEIRSDOTTIR, K., K. L. TRAVERSE and M.-L. PARDUE, 1990 HeT DNA: a family of mosaic repeated sequences specific for heterochromatin in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 87: 7998-8002.
WALMSLEY, R. W., C. S. M. CHAN, B. TYE and T. D. PETES, 1984 Unusual DNA sequences associated with the ends of yeast chromosomes. Nature 310: 157-160.
WALSH, J. B., 1987 Sequence-dependent gene conversion: can duplicated genes diverge fast enough to escape conversion? Genetics 117: 543-557.
WILLIAMSON, D. H., 1985 The yeast ARS element, six years on: a progress report. Yeast 1: 1-14.
ZAKIAN, V. A., 1989 Structure and function of telomeres. Annu. Rev. Genet. 23: 579-604.
ZARET, K. S., and F. SHERMAN, 1982 DNA sequence required for efficient transcription termination in yeast. Cell 28: 563-573.

Communicating editor: M. CARLSON