Protein Science (1995). 4:2424-2428. Cambridge University Press. Printed in the USA. Copyright 0 1995 The Protein Society

-~ ~ ._ -. "-~ ~~~ FOR THE RECORD New protein functions in yeast chromosome VI11

~~ ~~ .~ ~. ~~ ~ ~ ~ ~____.~~_____" __~~ .~

CHRISTOS OUZOUNIS,' PEER BORK,' GEORG CASARI, AND European Molecular Biology Laboratory-Heidelberg, D-69012 Heidelberg, (RECEIVEDMay 4. 1995; ACCEPTEDSeptember 6, 1995)

Abstract: The analysis of the 269 open reading framesof yeast &phosphate phosphatase, NH(3)-dependentNAD(+) synthetase) chromosome VI11 by computational methodshas yielded 24 new to regulatory proteins(e.g., hexose transport activator protein, significant sequence similarities to proteins of known function. putative BTF3-like transcription factor). For three particularly The resulting predicted functions include three particularlyin- interesting cases, we include alignments (Fig. 2) and a brief teresting cases of translation-associated proteins: peptidyl-tRNA description. hydrolase, a ribosome recycling factor homologue, and a pro- tein similar to cytochromeb translational activator CBS2. The YHR189w, a putative peptidyl-tRNA hydrolase: PTH (EC methodological limits of the meaningful transferof functional 3. I. 1.29) is a well-characterized cytoplasmic enzyme that cleaves information between distant homologues are discussed. peptidyl-tRNA or N-acyl-aminoacyl-tRNA to yield free peptides Keywords: computational sequence analysis; function predic- of N-acyl-amino acid and tRNA;its natural substrates proba- tion; genome evolution; translation factors; yeast genome bly are peptidyl-tRNAs that dropoff the ribosome during pro-

The yeast genome sequencing projectis making rapid progress, Yeast chromosomeVI11 and the completionof the entiresequence is expected in the win- Functional information:57% ter of 1995-1996 (Oliver et al., 1992; Dujon et al., 1994; Feld- Homologues: 71% mann et al., 1994; Johnston et al., 1994; Bussey et al., 1995; Dietrich et al.. 1995; Murakami et at.,1995; B.G. Barrell, pers. comm.; K. Kleine, P. Mordant, & A. Goffeau, pers. comm.). The original studyof chromosome VI11 (Johnston etal., 1994) reported the function of 46% of the probable 269 gene prod- ucts, either asexperimentally known or as predicted by exploit- lf ing evolutionary relationships to proteins of known function. As of the submission date of this paper, we have been able to add functional information to another 9% of the putative ORFs, by searching more recent versions of sequence databases and using improved similarity search tools (Bork et al., 1992; Koonin et al., 1994; Scharf et al., 1994). As a result, the level of tentative functional assignmentfor chromosome VI11 is now above 50% of all ORFs (Fig. 1).

Results and discussion: The newly identified tentative functions range from enzymes(e.g., chelate synthase, glutaminase-aspar- aginase, gamma-butyrobetaine hydroxylase, 2-deoxyglucose- Fig. 1. Information clock for the function assignment and homology ' Present address: AI Center, SRI International, Menlo Park, Cali- detection for yeast chromosome VIII. Information decreases clockwise. fornia 94025. 30, homology implies known 3D structure, and function is known Present address: Max-Delbrueck Center for Molecular Medicine, (mostly from homology);f, function known from experimentation;fs, D-13112 Berlin, Germany. function predicted based on significant sequence similarity to a protein Abbreviations: RRF, ribosome releasing factor (ribosome recycling in the database; s, significant sequence similarity to a protein of un- factor); PTH, peptidyl-tRNA hydrolase; CBS2, cytochrome b transla- determined function; ?, no significant sequence similarity to any data- tional activator; ApbA, a protein in a genetic locus involvedin thiamine base protein. Note the large proportion of structural and functional biosynthesis; ORF, open reading frame. homologues.

2424 Proteins on yeast chromosome VI11 2425 tein synthesis (Garcia-Villegas et al., 1991; Meinnel et al., 1993, The implication of these similaritiesis that RRF also occurs and references therein). PTH is a rare enzyme with about 25 in plants (Fig. 2B), and we therefore predict that this final step molecules per cell (Dutka et al., 1993). There are several hypoth- of translation is ubiquitous in eubacteria and various eukary- eses regarding the role of PTH in cell growth. otic phyla. The evolutionary distancesdo not allow, however, YHR189w and Escherichia coli PTH have about 200 amino an unambiguous functional prediction, as the plantsequence is acids each and are 27%identical over 183 residues (Blastp score much more closely related (45% identity) toE. coli than to the 85,probability P(N)7.9 x The significance of thissim- yeast homologue (23% identity). ilarity is strongly supported by the conservation pattern in a mul- tiple alignment of several other prokaryotic proteins related to YHRO63c, a mitochondrial translation activator? The protein PTH (Fig. 2A). The sequences cover a wide species spectrum sequence of YHR063c is similar (25% identity) to ApbA pro- within bacteria and havediverged considerably (less than 36% tein from Salmonella typhimurium LT2, which is involved in thi- identical residues between different prokaryotic PTHs).We pre- amine biosynthesis (Downs & Petersen, 1994), and similar to dict that all of the proteinsin Figure 2A have peptidyl-tRNA hy- CBS2, a nuclear-encoded mitochondrial protein on chromosome drolase function. IV of Saccharomyces cerevisiae(Michaelis et al., 1988) involved in the translational activation of mitochondrial cytochrome b YHRO38w, a ribosome recycling factor homologue: Ribosome mRNA (Michaelis & Rodel, 1990). The similarity of YHR063c releasing factor is responsible for the dissociationof ribosomes with ApbA (Blastp score72, probabilityP(N)8 x IO-'; Fig. 2C) from mRNA after translation termination (Ichikawa & Kaji, is stronger than that with CBS2 (Blastp score 69, probability 1989) and is also called ribosome recycling factor (Janosiet al., P(N) 3 x IO"); a motifsearch in thesequence database 1994). RRF appears tobe essential for growthin E. coli (Janosi (Tatusov et al., 1994) exclusively matches the threerelated pro- et al., 1994). Searching databases with YHR038w, the nuclear teins with a strict cutoff value of 0.005. protein D2 from the plantDaucus carota (S. Schrader, R. Kal- Despite the incomplete functional characterizationof ApbA, denhoff, & G. Richter, 1993, unpubl.) is identified as clearly we predict, based on the data from CBS2, that these proteins similar(Blastp score 77, probability P(N)6 X whereas represent a family of factors that are involved in (mitochondrial) its well-characterized bacterial homologue RRFis less similar, translation. Organelle genomeshave been significantly reduced at 23% sequence identity (Blastp score 50, probability P(N) during evolution (Gray, 1993), and their translation mechanisms 3 x IOp4). Both similarities were confirmed by sensitive motif mostly resemble prokaryotic systems. Thus, CBS2 or YHR063c searches (Tatusov et al., 1994) with conserved regions. may even be nuclear-encoded mitochondrial proteins of pro-

Table 1. Novel functions in yeast chromosome VIIIa

~~~~~ ~~ . ~~ ~~~ ~~~~~ ~~~ ~ ~~~~ ~ ~ ~~ ~~~~~ ~- - ~~ Query Access Identifier Protein family and/or predicted function

~~~ ~ . ~~ ~~~ ~~ E ~ ." -~ YHL02lc P80193 BODG-PSESK Gamma-butyrobetaine hydroxylase YHL003c P28496 YKA8-YEAST TRAM/UOGI family YHR003c P36101 YKC7-YEAST THIF/HESA/MOEB family YHR032w D24 172 OSR 14662A DNA damage inducible Dinf-like YHR038w PI6174 RRFEECOLI * Ribosome releasing factor YHR043c U00062 YSCH8179-15 * 2-Deoxyglucose-6-phosphate phosphatase YHR044c U00062 YSCH8179-14 * 2-Deoxyglucose-6-phosphatephosphatase YHR058c PI3511 CZCA-ALCEU Cation efflux system proteins YHR063c P37402 APBA-SALTY APBA cytochrome b translational activator YHR074w 403638 NADEERHOCA NH(3)-dependent NAD(+) synthetase YHR075c Q03565 YKD7LCAEEL Lipase/hydroxylase family; chelate synthase YHR090c U 13948 HSU13948-1 Zn-binding PHD-finger YHR093w U00060 None * Hexose transport activator protein Y HR099w L34075 HUMFRAPX-I FKBP-rapamycin associated protein YHRII5c 406003 GOLILDROME RING finger YHR137w X78503 RMMOCCABR-6 mocR gene product YHR138c PO1095 IPB2-YEAST * Protease B inhibitor 2 YHR139c S33203 S33203 Glutaminase-asparaginase YHR146w 214127 SCSPMl Glucose repression protein GAL83 YHR154w X62676 SPRAD4- I RAD4 gene product (X pombe) YHR160c PO8468 PTl1 _YEAST PET1 11 protein precursor YHR16lc S27867 S27861 Clathrin assembly phosphoprotein YHR189w P23932 PTH-ECOLI * Peptidyl-tRNA hydrolase YHR193c 228479 HSB20A092 Putative transcription factor, BTF3-like

~ ~~ ~E ~~ "

a Query: identifier of the yeast chromosome VI11 ORF. Access, Identifier: database accession number and identifier of the closest homologue. Protein family and/or predicted function: familyor function as deduced from the detected homology. Bullets indicate the three cases presentedin detail. Asterisks indi- cate the cases that are detected automatically by standard software. 2426 C. Ouzounis et al.

LnNNODDO 0 -0omIDo 0 "DDWDDO m IDWID or- P oooLnocr n. m"4wm 0 4" 4" 4 """ N 0-0 0 mmlD 0 PLnlD 0 PLDO 0 dW w LDPd N ONP m wwm CT 4 d "4d d 3dN N ea4 1 0,- I aa%a$za a MIDO 0. 4YLlil44 4 W11III I ~nxI mvllm m X Birr I I >xx I x0, I aaaawx a 4lIIII I EPH 1I:irr I XZ14 I 0 4.11 I I 4>> I Ha I E?xo0,zz=< I X>I I1I I la01 I n I Iv) I =I= 4 LULUrLUz-LU Lr LUX1 I1I I z 17- I ai)- 15- I zwzwxC" I 4411m1 I irr 11- I "7 ~I~C> D.fnfnW~> I xzLr1VI1 I 7 I101 I w IIIIIU I zC0,CU)a I LlLUzwx1 I f.zJ 2 z I IX I vllm IIIIlrn I 21I IIW I !XWfnXXI I I U I B ~nx w nxa I =-w I ZZVfTWl t I= x J@il il w X wxw I wo I CZZrLi.1 I m: lad I 4il W anx I uwv) I vw I xx>xsv I 4>Ur4l I @ lHil I -ha: I vllm-20 I WWU I 0,- I 0,wfnawx I xwxw0,1 I a: I= I I I 4>zI cr I "fnU€-+l I CilWZUl I roaz CYw7 v1irr CLUX I wx I -ilu>e4 I pJ>=1 rl 1v)Z I In XWa I -4CY I >w4 I >> I nfnfno)zv) v1 uCvxau1 I IMI n a:avllm 1 wen I aww I xx I xxaxxa x nxzww~ I IE~I a:14WX I 7 I VIWfn I I 4H.aU.24 4 CWX4X:m I zw I IU I -7-2 I r-l I err I Lrw I I F BW'HU I Iw I v) xxx X vllm r4 I I 0,- I muzzzzzz z 3;:v)zwzr I I 10 I I u)ha I X uw I 1-z I xxxxxx x z44wLUw I pJz I IW I I il H LUzw I LUC" I uwczuu w ~Wpcw I IW I mlB> LU L"lwwulzwa I HUVf-44 I H a: I r xxx I aa. I m4>a4u I u4-r3;4 I I I> I H X B I 42x 10.0, I wlnwvIw0, I axna~"aI I IX I man I R a 0 wwx 14- I IIIIIC I 4>HX:V)3 I I IW I XEh I a "LC I wzaI WC I 4W44H4 I (C)mpfn" I IC I 11 aww w il aSlx1aa I wxa~lnaI wxwLlw4 I I 114 I mn.x I a:WC I X 1- 1- I ow 4 4E-wzwa I I Ih I h h CY C"wLU I aw I xaxxaa I - m: 7 I 1- I 5 14 4 il 4-4 I w4 I LUil44LU4 il 4<.afnwm I 4 I ILo I h 14 azw I 0,w I xaaauw a wwLrxxw I g n a0 BzI= 4 ~-UHH> Y I,,-@- I I IX I W 14 ma& I .r I aa I >XX>WX I HU*ULU4 I I IX I il awx €z * v) ow I I 1 wxfn 0,x 71w &grXm 4 I IS I w VI ilma: I 8 <<> I aa I PC4ilb.U I Xa.X>UC I 114 I w f7 I 0 mn I= 4 ~"n.~.n~aI wawwzw I B E: I IH I 7 gcxvllm a E 40x I zil I 1-p> 4 I I Adad .s I IH I x>> > nnil r( > -I= 0 nnnazv) I I I -7axu I a 4 >U4ldd I =="-a I I IOCXI I I ILU I nk-iil H 7aa I I 8 Lr>411z I mLKzzzl w I lzilal I I IC0 I I nv)d I n 8 3;zu)llr I Io ok'n n 01 o wwwwww w 1101 I D mn W w m .xFrb.l 14 I xr4x>>U > >xxzLUC" I I Ib I pj la I g2i z5 iiH I I <>>II> I >>>MU> > PlCuwVle I M I Iv) I I I114 I I 4e..azil> il Cvm[ Fr i; I la I n. I I> I X ; KJH44"x I xrnawz4 I H I IZ I mi ;zj wnzzwa I WW~0,VIW I z I la I a MIIH c nwxawa I I Ib I I zzw 2 xxx X 4241 IW I 0.Z".a>W I BT 114 I % a n4n.D. 14 P7was I aw0,I I4 I znuafnr I I la I c3 h a il -1 I I la r mxmx uov~wna I In. I ah0 I il I wzWl101 I an~ww-x I >>xx4w * I I> I Wax I W X axil1 IC" I LUrLULUFrr Lr I Ih I ilpJ U .a I WWWI lU I LUrn=L"a I EE14wHIIIllo) 7 a I I> I I fnov1oaa I II1114 I 141w n vlxw I E:$j 2 I xxo~wnI Lo ~ a: HZil l 44ill ICI I aD.rn3.4x I zzazwa I I IV 1 I I aa:H a w aa!%x4u I xuz0,o)v) I -1 la I12 I CY U c >.zilllr I 74LUl>> I I I I Iwa I I lil I %E-x X a: a YYUll4 I 4u41aw I 111 10.0, I I la I v)XO I X a ,2741 I> I xwfn IXW I OzIcZw I R 114 I EhH I wwn w H x441 I4 I u>i)zma I IIh I -xu1 la I 0.0.zwC"w I la1 I > z c1=11s I aamnmaa I IIh I I= w 0-2- I a x11 IIX I uwwxww w 114 I I 7il I X il IIIII" I UU>U>U U EPEI k @j i I IH 1 I v€- I 4-7 I 4 IIIIIWl I LUrr1W.r wwwwww w - I II: I I= X v)~nI > IIIIIZ I m"ULrclH - Y) m m m Y) m.4 r -42nDuL 2 .-3nmu3 .43nmu* ; o4ma-43 2 m.4 3 2 2 m.4 3 2 3 mMzum -4olMzvm 2 ohm ohm c 2 ZeEYZ c 2gm c u3w gZN>;>s 4 0 "....a% u"Js=-C4 : 22z % 2 gz k?j 2 om c 9 0- c: wm&,uyu I l0.CvC"F 2 :z22zz nwu c 22% 2:~ cv r 8 cvI lC"0.n.c 8c cuTlzzz; u2 auI IC r a'uIe du&IS z alu '5 sz urn urn - s u" u" 4u n.0. C"a 2% 4u 0 3u u) fn za za 2% ZE a m Proteins on yeast chromosome VIII 2427

mmm N PN- w wmm NIL)- m 4-0- o YlPP m 3" -4- - N" N 0" d 3" 3- ANN -N NNO 0 NOCI -

I 0 I o 1x1 I llaq m a I t.m I I =4e.r I I ua I nu> I I= I I men I I= I 65-0 CX" c.5 c om 1?vj zwu I I we I 8 0 I I om I m'Z2 E 3 4um I I Irn I I I I wt. I - Q :m .C xE-mr4k 0 I I I> I a:CW I I&> I I20 I b.zg $3g: I I In I >mm I I zm I I H-1 I g:g t; d .g 22 .- m>c I I I> I om> I I >r I I -4 I ;.::>aa- 1% w P.0 I I I I c,z I I z ru I @ c fi:G -3 I ZP I F: I zz4 c Mm I H >WO I I wz I rlmx I B Lr I I= c I ac I UZC I ~E&g~ I -c I W I wm I 12.2 I 4-1- I 'CI w .- I a0 I Z I I> I Iw I I me 4 9 ZLXZ Eok m Lr ,,A I IW I I "20 I % i?&ZMPE 84 W rl I IO I IZ # I X B u) x3 E'C B B+l 8 I I I I_ I IW I I SZW I 5 2s -+I=& €.a= I 8 P. I IW I I_ I I I 473-2 4 0 rm I U I I- I lrl I B 4 -0 'Z B c ._ $F -am I 12-2 I I IW I lX I I -1 2i?Eqpj% rlm x I xw I I IM I Iw I I WWB I <: s"cs x2+ z u> I I MO I I xa I ITI I TWrl I z.ge;&b-8 H I ne I 14-1 I It( I I mrl I M QZE B m I ea I I Lra I 101 I I e- I p'2gs.$s v-x I I= I I xm I le I I I xn I g.asm3 s rzm I I -a I I" I I= .L I= 2 'Z $4 P. zc, I YYr I Wmw "M I I >x I mg 2 owz I z wu I 0- E I mea. I X 4<;;izgscG; "U z -up: I 4 m r :m V n.2 bm I= L >C I I rl rl wum I I uz I c ..$$ Ag I w I x I I zu I $4w&i*p rl a I I ax I w 4>r fzj: U .E u;SZ+~L IO% I 41.;; OmL: p2J c rl m rl I 2z> w Ob 0 a 0 I M I -1r I 2 m.?.QcL,& 2 I W s>OW I IZI I 042 z zoc Lr,.+.-E55 .. >=4 -3 z UHrl 141 I oux I B W 8 I I IZI I ,$g z 2 T,$ .z m wx I z wux I W U:pX I +g2&- & 3 0 I n I uuw U QI m WE3 m y 0.2 <'Z v" Mn I H I g I- U.44 4 L m 2,u k" .. x &z I -1 w v-rl I X & mze::"$ go -mx wa-m 0 I -1 I I > u c ;*l -1 H I Icc I 4c x I I z .-M:Jsz ._ a4a-1 I z I no I I 2. > c I IW I Y W %.ij g;.-zn% I zzcczw B :H > I I la I I z a :;,x%"LE I I rlyl 1 PI I Irl I ow0 I I I lM I cv I g z;;i4+gzz 14- I 8BY2 u r I I> I e -4 I I li I mitm w I I GzzZ EL, g I I I p I lX I I >m muP. - 2 8 9FsZa I la I a I >u I I w w E.5 I IO I rl m x E"" C 023: g I ZM I 2t.r r x rl I !g $Z+ ggk I I I u I cLII %>PI :auxw @ W E g c m & s$j I= I wum I -0- I W rl 'D 'G g ,? w I rl W"M 1 UL.0 I I uo am z +Om"asg" LUC%.C am 1-34 I z 40 I "ZM I I ,x I urlz I u-u 12- I e w4 I mw.2 I I l* I I ZiE E -0 QZ Ins I M xc I cwu I > & o -02 Lrrj 3 s; m I u e- a cwpl I 0 0 sum "0- Q m ?G U I web I a I ,z.i7j;.1 u e, ngz.y 29 rl I ZZW I I x 4rl E I 4n.o I I & m zszj., $?QZ I I I 4 w p-5?&..WL W X X fje. -mu I s 0 x-+ I rl =X e:&, E a: z 'S > ~~ad-0Z !gJ H a Warn I E a: c w k a.G .. .y 9 e: I I - W > uxm z zwzq&& I I ZZLr I -x- I z W rl W U I m0 L-m +VI228 La-, @ -1 z nu- I I g$n w e% coez uurl I I i I a @; z 2.2 VI CG g&s em* I rl" u W P I 5.5 22Zk .."m m u w *a a 4 I g l xcu I I14 I g m rl a 2L. PP0 0 %?,$w EmI rn X I Zyl9 1 ,I- I I zx I wc.2 M 8 2 px4jj I .x- I I IW I I= I I B > 5 gg .& g,q I Tm I I I >m I I t.wrl I Iayl t x I wrl I rl rm m 8 8.; 2.?G$- rl m I W u>Lr I 14- I crl I40 uwe MCAZ%$Z.=?S.; I rl I xLr I I I 7.- ea;? 0.c I CZ g E 1x1 I Lr e- I I= I am Lr Ez I "g8gsLdf2 z;,s5i 3 n n on "7 xuu a -0- 2 x0 L1 a x0 L) a xur n n n Y- n YO n YO n c YO n c UO n c on c n 202.- m 2zg3 4- 4 4- m 4- m 4- m m ~r E, e nm 0I mo01 0 mo01 --mo 0, mo % a0 % m *r % m *r m *w. m It. : m Ir : mlEN' 0 m'eN10 -'E Nj 0 me N' me N( 0 nnU nnV nnV nnV nnV aafl $8 $5 $3 23 0 2428 C. Ouzounis et al. karyotic origin transferred from the endosymbiotic mitochon- 1992. Comprehensive sequence analysisof the 182 ORFs of yeast chro- drial genome to the nuclear genome (Thorsness & Fox, 1990; mosome 111. Protein Sci 1:1677-1690. Bussey H, Kaback DB, Zhong W,Vo DT, Clark MW, Fortin N, Hall J, Nugent & Palmer, 1991), where they have further duplicated. Ouellette BF, Keng T, Barton AB, et al. 1995. The nucleotide sequence of chromosome 1 from Saccharomyces cerevisiae. Proc Natl Acad Sa Conclusions: For newly sequenced yeast proteins, the rateof ho- USA 92:3809-3813. mology detection in current sequence databases has surpassed Dietrich FS, Mulligan J, Hennesy K, Allen E, Araujo R, Aviles E, Berno A, Brennan T, Carpenter J, Chen E, Cherry JM, Chung E, Duncan M, the 70% level (Bork et al., 1994; unpubl. results). The level of Guzman E, Hartzell G, Hunicke-Smith S, Hyman R, Kayser A, Komp identification of probable function from database similarity C, Lashkari D, Lew H, Lin D, Mosedale D, Nakahara K, Namath A, searches is more difficult to estimate. The proposed functions Norgren R, Oefner P, Oh C, Petel FX, Roberts D, Sehl P, Schramm S, Shogren T, Smith V, Taylor P, Wei Y, Yelton M, Botstein D, Davis R. (Table 1) are, for the most part, the functions of related data- 1995. Thecomplete sequence of Saccharomyces cerevisiae chromo- base proteins. However, depending on thespecies context and some V. Forthcoming. evolutionary distance, the similarity of function between related Downs DM, Petersen L. 1994. apbA, a new genetic locus involved in thia- mine biosynthesis in Salmonella iyphimurium. J Bacieriol176:4858-4864. proteins varies substantially. The assigned functions should, Dujon B, Alexandraki D, et al. 1994. Complete sequence of yeast chromo- therefore, be treated as plausible hypotheses, ranging from cer- some XI. Naiure 369:371-378. tain to approximate. Withthis caveat in mind, we conclude that Dutka S, Meinnel T, Lazennec C, Mechulam Y, Blanquet S. 1993. Role of the level of plausible functional identification by the 1-72 base pair in tRNAs for the activity of E. coli peptidyl-tRNA hydrolase. Nucleic Acids Res 21:4025-4030. methods will surpass 60% as theyeast genome sequence is com- Feldmann H, Aigle M, et al. 1994. Complete DNA sequence ofyeast chro- pleted. For some classes of proteins, however, only experimen- mosome II. EMBO J 13:5795-5809. tal work can ultimately elucidate function. Garcia-Villegas MR, De LaVega FM, Galindo JM, Segura M, Buckingham RH, Cuarneros G. 1991. Peptidyl-tRNA hydrolase is involved in A in- hibition of host protein synthesis. EMBO J 10:3549-3555. Data and methods: The 269 sequences of putative proteins from Gray MW. 1993. Origin and evolutionof organelle genomes. Curr Opm Genet yeast chromosome VI11 (Johnston et al., 1994) were obtained Dev 3:884-890. from the Saccharomyces Genomic Information Resource at Ichikawa S, Kaji A. 1989. Molecular cloning and expression of ribosome releasing factor. J Biol Chem 264:20054-20059. Stanford University School of Medicine via anonymous ftp. Janosi L, Shimizu I, Kaja A. 1994. Ribosome recycling factor (ribosome re- Each single protein sequence was subjected to a variety of ho- leasing factor) is essential for bacterial growth.Proc Nail AcadSci USA mology search tools that retrieve information from numerous 91:4249-4253. Johnston M, AndrewsS, et al. 1994. Complete nucleotide sequenceof Sac- sequence databases (for details, see Bork et al., 1992; Koonin charomyces cerevisiae chromosome VIII. Science 265:2077-2082. et al., 1994) using GeneQuiz (Scharf et al., 1994). For initial Koonin EV, Bork P, Sander C. 1994. Yeast chromosome Ill: New gene func- searches, programs of the Blast series (Altschul et al., 1990, tions. EMBO J 13:493-504. Meinnel T, Mechulam Y, Blanquet S. 1993. Methionine as translation start 1994) were used with default parameters (including maskingof signal: A review of the enzymes of the pathway in E. coli. Biochimie compositionally biased regions; Altschul et al., 1994). Using 75:1061-1075. ClustalW (Thompson et al., 1994), multiple alignments were MichaeliF U, Rodel G. 1990. Identification of CBS2 as a mitochondrial pro- constructed and evolutionary trees calculated. Weak sequence tein in Saccharomyces cerevisiae. Mol Gen Genet 223:394-400. Michaelis U, Sclapp T, Rodel G. 1988. Yeast nuclear gene CBS2, required similarities were inspected and distant homologies verified by for translational activationof cytochrome b, encodes a basic proteinof pattern search procedures (Tatusov et al., 1994). 45 kDa. Mol Gen Genei 214:263-270. After all ORFs hadbeen analyzed, entries were annotated (by Murakami Y, Naitou M, Hagiwara H, ShibataT, Ozawa M, SasanumaSI, Sasanuma M, TsuchiyaY, Soeda E, Yokoyama K, Yamazaki M, Tashiro assigning features of interest, such as phylum information or H, Eki T. 1995. Analysis of the nucleotide sequenceof chromosome VI functional classes) using a relational database system and expert from Saccharomyces cerevisiae. Naiure Genet 10:261-268. modules incorporated into GeneQuiz (Scharf et al., 1994). A Nugent JM, Palmer JD. 1991. RNA-mediated transfer of the gene cox11 from the mitochondrion to the nucleus during flowering plant cvolution. Cell complete list of functions assigned to proteins fromyeast chro- 66:473-481. mosome VI11 will be accessible via Internet using the URL Oliver SG, van der Aart QJM,et al. 1992. The complete DNA sequenceof http://www.sander.embl-heidelberg.de/genequiz/yeast/chro- yeast chromosome 111. Naiure 357:38-46. mosome8. The searches reported here are valid as of Novem- Scharf M, Schneider R, Casari G, BorkP, Valencia A, OuzounisC, Sander C. 1994. GeneQuiz: A workbench for sequence analysis. In: Altman R, ber 10, 1994. Brutlag D, Karp P, Lathrop R, SearlsD, eds. Proc 2nd Intl Conf Intel1 Syst Mol Biol. Menlo Park, California: AAAl Press. pp 348-353. Tatusov RL, Altschul SF, Koonin EV. 1994. Detection of conserved segments References in proteins: Iterative scanning of sequence databases with alignment blocks. Proc Natl Acad Sci USA 91:12091-12095. Altschul SF, Boguski MS, Gish W, Wootton JC. 1994. Issues in searching Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTALW: Improving the molecular sequence databases. Nature Genei 6:119-163. sensitivity of progressive multiple sequence alignment through sequence Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local weighting, position-specific gap penalties and weight matrix choice. Nu- alignment search tool. JMol Biol215:403-410. cleic Acids Res 22:4673-4680. Bork P, Ouzounis C, Sander C. 1994. From genome sequences to protein Thorsness PE, Fox TD. 1990. Escape of DNA from mitochondria to the nu- function. Curr Opin Siruci Biol4:393-403. cleus in Saccharomyces cerevisiae. Nature 346:376-379. Bork P, Ouzounis C, Sander C, Scharf M, Schneider R, Sonnhammer E.