NEDO -PR- 9601 . t-Afr 9 <*, t

y y a m m m;s'i

¥fi69^3fl OF THS D0< nr §NS

SrX^;i/dr —

SE5fe tt0>iA ?T Y)/3? X h U — DISCLAIMER

Portions of this document may be illegible in electronic image products. Images are produced from the best available original document.

(m ^ a m)

ZUMtLfzo 8 B X

(##%) R% i. m ^ mm sbs i ftJX v n. #&#%%##

1$ (±cAc # @±

1. $ i

2. i s. 3 4. 21#E#y/A#$rxtHtT 3

#2# rfJ cf fMf 6 i %#±% rfrW #tf 7 i-i rf: W #ff 7

i - 2 ##m# 14 (1) 14 (2) 24 (3) xk# w- 27 ( 4 ) iSlS# -t 7* )V <7)fSS TUa mi 34 (5) 'E-60te6O#S TfrW mu 40 1-3 SX f'JX 44 (1) 44 (2) mm 48 1 —- 11-4—4 Mmzx&nfe htb #± 51

(1) cDNA-fishing urn 51 (2) hybridization HTB 54

1-1-5 ww 55

1 - 2 HTB 58

2. 59

2 -1 a 7 4 mem 59

2-1-1 Fluorescent Differential Display(FDD) mem 60

2-1-2 Molecular Indexing mem ## 62

2-1-3 cDNA Microarray mem 63 2-1-4 c 66 2-2 ####? 72

2-2-1 #* 72 2-2-2 m @z (## 74

3. &#?)#%?### 84 3 -1 84 3 - 1 -1 1%^ c 84 (1) e 84 (2) e 85 3 - 1 - 2 HI S3 92 (1) HTB 93 (2) y;i/y7 T& HTB #2 93 (3) 7 y TXV > T& ## $ 95 (4) >##& HTB #± 98 3 - l - 3 om#; HTB mz 98 (1) t^x^xx? >vi HTB #± 99 (2) 77-/^4'7'V'yK&/7 — /\/f7'V yK& e# $ 99 (3) 7 T-y^4 77 W# HTffi 103 3- 2 105 (1) X# 105 (2) amwuy < i/-y 3 > 109

4. a ##?-### 113 4- 1 ? >/< ? a# 113 4-1-1 NMRi: 114 4-1-2 a* ag 118 4-1-3 3D-lDT70^>f 120 4-2 123 4-2-1 y->^-yyf4>7 ##* 123 4-2-2 ### 125 4-2-3 7 < m fiJX 127 4-2-4 SX fiJX 129

3 # m 132 2. BB 136

4$ >/<7 hm#l 137

M# fnS 140

. (####) mam## i##m ^i#i#a##m 141 :## X4ff?/oy —k>?- ?l&8 m 143 5 x./is

(NEDO) (m) x/ft/ryyxhv-m^ (JBA)

mmADNA%#^DNAy-ir>y>ym#(o^%(:zo,

#< % Cfi6DNA<7m&@a#lf'-?/<-%#,

fc-c, C<7)#^:»DNA^mmK^I^6, mef^R^L,

#m - ##&#"?-c#to -E-LT^fi6 7 h9-^e#%t&ctcz ^ yemmmcf&mf & c & e g % c, &%#2:i####

yyA###m#ijm%ms /

i). j: o#Lv^^yy m^tzhz.h zmwhtz^o

(#) ;y^ hu-m^ # : t* @± k h y y A#$f t > ? -0:^ : mm m& &&%#%####%# &####& 4>s n& HTB M2 m# gg mfc $iJX tty/ A##ft > 'b&QWM a#:m^c&c#%mr (tt)a^M#pif ±e#%m iWifl i- amman:##) #%^agy-y -vy-y - 7^0- #±m(W ##y%yA#%gg /W/NSAMS&m

mem## 7 7;i/7y7/

:±m ±@ £is B£ #A my##m #m /W tyyy

sm mm > y - /Wf yy y ay ——

me #E n^m % $#m#

## i^m c hyv

/< /f t ^ -f /f *

: (m/<^ t/f >y% h v -m^ $#@# f>y% h v-m^ %#§G^ urn %#

Summary

The Project has advanced to the stage where U.S., U.K., German and French scientists are now engaged in large-scale human genome sequencing programs. The complete base sequences of yeast and several other microbial genomes have already been determined, and new life science areas based on such data are rapidly expanding. This report therefore takes a life science and information science perspective on the technological developments needed to apply genome information to a wide variety of industries.

Chapter 1 provides an overview of genome maps and of sequence and function information at the current stage of international genome analysis research, and highlights the issues confronting these efforts. The information obtained from genome analysis is being compiled in databases that will in future enable us to simulate diverse biological systems on computers. The analysis of genome structures requires a new system that integrates existing analytical methods and executes them much more quickly, on a much greater scale and in a better organized manner. Methodologies for analyzing genome functions have not yet been established and need to be further developed and refined. Efforts to analyze functions have so far focused primarily on individual genes. The new challenge is to perceive biological manifestations as expressions of entire systems while elucidating how sets of genes interact to control the expression of discrete genes.

Chapter 2 reviews the current status of the technologies that analyze and make use of genome information. It also discusses (1) prediction and identification of gene regions in genome sequences, (2) techniques for searching and selecting useful genes, (3) technologies for predicting the expression of gene functions, and (4) technologies that predict gene-product structure and functions.

On predicting and identifying gene regions in genome sequences, the report discusses the sequences characteristic of eukaryotic gene regions, as well as the different approaches to, and pending issues in, predicting gene regions with information science methodologies. The report finds that linear models, especially the hidden Markov model, are best suited to constructing genome models. Differences between prokaryotic and eukaryotic transcription and control mechanisms are compared, together with the associated analytical techniques.

In the area of techniques used to search for and select useful genes, the report describes fluorescent differential display (FDD), molecular indexing and cDNA microarray technologies, which are critical for rapid analysis of gene expression profiles. Its coverage of gene searching technologies examines the positional cloning method, which identifies the genes responsible for an expressed trait on the basis that genes lose their intrinsic functions through mutation. Experiments on humans and other primates are limited by ethical and financial considerations. However, important knowledge about useful genes can be obtained using mutants of organisms that are carefully selected for a particular purpose. The report discusses approaches that use microorganisms, nematodes, fruit flies and mice to find useful genes.

In covering technologies that predict gene-function expression, the report first reviews methods that reveal transcription-controlling regions for cloning. This is followed by a discussion of methods for analyzing transcription-controlling regions. Since gene transcription is controlled by interactions between DNA and proteins or between proteins, information about many transcription-controlling factors is registered in databases. Many models for simulating the mechanisms that control gene expression have also been proposed. However, these models are still only at the stage of being evaluated for usefulness. We still need better models that can be applied to a variety of complicated processes.

Since proteins are synthesized on the basis of gene information, the report also discusses technologies for predicting gene-product structures and functions. It examines the two most commonly used methods for analyzing the structures of synthesized proteins: nuclear magnetic resonance (NMR), which is applied to proteins in solution, and X-ray crystallography, which determines structures in the crystalline state. To predict the three-dimensional structure of a protein from its amino acid sequence, the 3D-1D alignment method is widely used. The method predicts the initial 3D structure of a protein by comparing the data from the target protein with those from known 3D structures and selecting the one that fits best. This approach is taken because it is impracticable to predict by statistical calculation the 3D structures that sets of amino acid residues can form. Techniques for gene disruption and expression suppression, which are useful in clarifying gene functions, still need to be improved. The report reviews the homology and motif search methods that are widely used to predict the function of a protein from its amino acid sequence. It also discusses integrated database systems that incorporate data on amino acid sequences, on functions (including protein sequence motifs) and on 3D protein structures.

Turning to the course of future research, Chapter 3 recommends that Japan's industrial, governmental and academic sectors focus their leading technologies and expertise on developing new technologies and on collecting and interpreting more data in order to clarify inter-gene information networks, a subject that will be of the utmost importance once the DNA sequences of major organisms have been determined. It also calls for more research on gene data processing technologies so that inter-gene information networks can be simulated on computers using protein databases of amino acid sequences, structures and functions.

Chapter 4 gives examples of how the research project discussed in this report can affect industry and society. The focus is on sectors such as pharmaceuticals, diagnostics, diagnostic and analytical instruments, laboratory equipment, chemicals, food, agriculture, animal husbandry, fishery, electronics, environmental protection and information technologies.

The final chapter cites the need to develop technologies for interpreting and using genome information through multidisciplinary cooperation among professionals in genetic biochemistry, computer and information sciences, precision engineering and other fields. The unprecedented nature of this project demands unique research infrastructures and periodic review of project targets and progress, while keeping up to date with the latest findings in this fast-developing area.

— iv — (:%oa,&a&#"cv'ao coz 9 ^^%Tc*-oT. y v A##eg; < mmcf&mt

#%omfu:f&c-cyvA#^. y-y>xi##, ##m#c^v'-c WmLto y/AM#f(:j:c"C#6fi&m#(±y-^/<-%

6^Ao#^^Tyn-yo^m

^Z^M#OX *cXAd#&&C&&&##%#&#&% W#&#jtL2:o ffltft't ^SS’Cab -S C t ^ b^ Fluorescent Differential Display (FDD), Molecular Indexing, cDNA Microarray tiovT £ t $>tz0 Xtsd^SSS#! t L*C(±,

- v- fCT.

r7n--7££ b&tzo

DNA- ^ j:cT$iJ#$^''CV'&

j: % tx cmmnm&#A&a w±#t f %> o

(NMR)

iDT^/f ^

<7)7 5 y EE?iJt-5\ -t-f-7#<7)##7'-^, ^ >/<^#<7)ii#:#^7'-^#e#'^ A(:cuT$

3& • M ^ t°a.-?±-C"> 5

— VI — ^ i,- j: 9

Kr#. ##»###-#%#%&. ib^c^,. m# (W). dig. #g. %-l/X ha -Xx. ##. ##. ^<7)#(:-ov'-c.

#5#-r(±.

#(:##c»&%&7ta-L. yavj:^h^)^|n]. ^-yvh^^JiyXL^d^]#*

- Vll -

1* (tutbiz -'fsAfmjrmmtmm-

1. &

^"C*6o a L"C^%%(:^ A^###(0±Ty < ^.1%- hf& FLifeinSilicoJ vXTA<7)^%t:c^^&&(D-C6&o

2. vs t*w$Lmmtmm (i) y/Amm yvAmgi(:(i±#

#@mia(±memiaet

mem a : fc h y V A <7) mem 121 £ L -m Genethon WE L tz 6000 i? 1) <7) V >r ? n

/f^T9^KyyX. yg'yyay/iJ:. ^-cv^o at##. ±mm. ##m. cmmm^memBi^yy A@tmm#^6#ef&o Mmia : Egijyn->mia, yrnHW^T-^So ^ h-mvAC ^o->^Ui:lf;fflv^(7)!iai*-h^otv^ 0 BAC^ PAC&fc"£ fflvW>-7^yxfflc^fiSmHW^ift W WTWo *%, vyx, 4 *, #

A, T7tfKyyX#-e(j:y-XJ:>X i/-^JL>X(0 m^cmA-cv'&o ##^&#/x

-l- STS'?-# STS %|g, cDNA^gg^K^iJ (EST) &'7v7LtEST'7'77&2f&Lhy/A;&#t:c,V'

(2) y-^%.>%i## y —^ L"Cy/ ADNA-S-0)t(0(7)y —^ 2: V't)W>6 cDNA

cDNAy-^ ^yxCML"C(iTIGR#&i7^ /!/?# 7y > h j:c-C±m#

y/A(7)y-^j:>x(:cv'T(±, /J\#(DyyA4)-^x^#c>m^#T(iBE(:^ y/A?)y-?j::/Xd(&&gft2:6?X &Cf4:#(:3cTM(f)62>%£.

&o ##y/A&#o*&a.fy a Oy 7 9kf Kyy%T(±Ayy Ay-^J:>% L hyv A-e(±-m»)^y-^%.>%^^@#2003 ^<"3^%#,#: (22#, 21##) "C#24F&i'L3

TV'&o —'?/f379X'?, x^-Ad<##-C&%L"CV'&o /<^TV7y/A6D^ 9 —

%> o (3) ##m%

-2 i± EUROFAN ##%#c ^ bei± CAai-&AmA#"C(±cDNA^##(:Z&M$^^M%, mWS'TyUyycjia^

V'ffUiy v

###mc Z &m#&% ^####^'## h ^ & o

3. y/A#mm '7vU>y&0fi/-y%.y%m#[:ov'T(im*e(M:m^±^'3Tv^ (4-m, fb#%

(:^ffi-6yXTAikm+6^-cv'&o -eu-c^, # ## emATV^o DNA

^ x 9 jK^itS#£tf o X v'h o

f & y X T A 2: ^ 6o C fUi^t A#(DMamfsf #Kf c t ? Tm#g ^<7)y-yj:yx###^#$rj&(:c^T(iDNAy Mimac-cv'&v'o

EM(U; & v x y AfiZ>„

4. 21 memyy A##r/\&UT yy A:+m@#:(±4"m[±c hyy Ay > y-x^xy

'C'CM^f&o L^'L. ^yyaf^Xo

-3- ^C: h C^6 9o 2%afxfArnam&lWB&6,

##&zwr&m-fm^k&&o c(7)Z9^*#<7X)2:.

(1) ±m#y%TAil:t:^'gEa$fL&%# (2) c##ff&## (3) (4)

(5) y V A E^CO^Si^'fju tf - 7^§*5&Mi_'i> 7 > tl° j- “ (6) A##A5%# m « V-^XVT.-£>?-%£ *t Ml % u Sanger Center k > Chr.22 1996# 50Mb (#&) 20 i 1997# 90Mb (#&) 11 1 X 1998# 150Mb (#&)

a

* m Washington Univ. k h Chr.22 C 21 2 #BIT60-100Mb 100^6 EST

MIT k hChr.9 17 d#7 h ffl5kKt>*An, ffiffi%'s*

Baylor College k hChr.X #99 (10Mb/#?)

Stanford Univ. k hChr.4 #99 21

TIGR (j£:S) k hChr.16 2#rsr30Mb@E Aic±;D&j3< Univ. of Washington k hChr.7 #99

Lawrence Berkeley Lab. k > Chr.5 10~20Mb/# '>37^37 •>3 7-73 A

Lawrence Livermore k I-Chr.19 #99

79>a GenethoneMiS SBti20Mb/#£E9l GENSET (A#) (30~50Mb/# ? ) 1996# 8Mb f-M y Jena Univ. k hChr.21 1997# 12Mb (#S) 1998#20Mb (#S) E * Cancer Res. Soc. k hChr.8, etc. 4#-A-#ftr Kitasato Univ. k hChr.21 4 — 6 Mb/# Keio Univ. k hChr.21,22 Tokai Univ. k hChr.6 Kazusa DMA Institute Cyanobacteria Arabidopsis

— 5 — K) hQ 11 tjm # "X6 IM A AJ X G G y A # PS G y G 4 & X ^t 4S # 'A # S fs£J , ■K f i m # # <£ $ G uK X # # AJ •R G w # S i'K AJ # u 35 # At G X s s> # # -M G -> >R y i ti. 3 A ^t3v <# v-' m w H, % C' y X Si K' s X , tA G x- G AJ -R At G m » 44 X Bn YO u -R PS £6 A X G iK G 4tm # JJ tig # __ y ■R # K) 4^ nn X jg # , fj *s 'y 4? BE K k JQ: Inn % <-A A K& i\j 4jj # tQ t<£? G Bn # # S' * § 4* y # is %v 4*\ # 'A y 1 m E fi4> % 4) 4@ A3 •M i§ * 0 liif * X 3 O y H y # * S af t J 114 A y %: •R +6 u $9 ■m- Si *R # G M y 'C' i j X » 4> m X& Xb G i s ~> % A # y F3 -R # AJ y X y -M V At # M M 0 AJ e G % A AJ rf # G to s y -J E "tit # IK e n|r> # % AJ a G G f "R Si tM m w 4f $f X H x G AJ At AJ JJg X ag AJ 1A At tA y % ' 4> # l<3 IS # E 'R m IS # tV NZ Si +6 •4 , G R 4~6 si >J m Bn % X X JA v •* •GJ @ Ss ID If’ y * Si Sr <4 Ifc G s 4 s V # 0 % G 55 4d y G •w A X y —, tig A G A G o 4 •X % SB >J G w y X 11 0 A K3 4* X O X S -M Bn # W- tig rS !U IS 4$ 40 fA 4-> A * 40 * G G •nn y •fr M’ % , 51 h$( 3 A n)o SB # Si G s hh a X G m & 0 iK ■ffl BK X 4/ •X 4 s 1 y sa IK sis •&J tig •R fe tig f J ■|§ ms y # •t§ u 4 A X t) *H y M u i|m K? i j gg , A 41 58 m < iinn I-R « A G °< u iX *3 {if ■ ■y 6 tM AJ X Sr h K E H? % G 35 y G ss 4 s'//♦ Q « X ■nn = e SE •nn y # »> IS O 4- S> "R # A i j *4 46 4 44 £ +< x # 4$ j€ , $xQ Si G tfr fj im li 4: # AJ Si y 40 y # X # A G AJ M) 4 JJ E5 •H Q ‘A Sl D M 55 G 0 #'//* rv> PS BK X 4r A Bn ■H -R « A txQ At 1^ fA ■H X # X W * < Si & tA o u A Bn *t A <0 y # & y X o ■H G S t j , G & 4 w ic < -J JQ: E 0 S bK 4> y # 4 m tA % e K? rS A gE A G H _t H A> A is j- A G imEE •^ iW CM s % f S 0 48 "R V K? y & S' -< G m EBk 4 # X t^

x.Z9o

1-1 m#±# (1) %# &# (O DNA f C & D cm#^# a It^T DNA $ d*m 'o

(2) M® A# (#cm#A#) T(±±# ^DNA K^|J

(DmRNA > ho>, W;$ fif ? >/<^ ® (:a#R$ K&V > a #lf fi&o B 1-1

1-1-1 zfu^-Z-mm. Mittl±RNA*°V y 7--tfkv>^ 7 j: t) RNAC &&mAd'&mtii5^(omtf:5iS]&Tm. m^iHie^manf^o RNA^v ^9 —if^yyADNA^^^^misf *#&#-?#. c(:RNA* V * 7-W&&ir5;it:iD^MMSIr& = Mt>41-AK3n|T&&o TATA^v^

7- XCTATA (TBP) -5-fU:#6#ET (TFIIA, TFIIB#) #mt:RNA^U^7-4f^^f6Ck^#%$fLTv^ (Do TATA^vXX(i^L-Ct#6^iiff^t)ft&^ c

[Figure 1-1 (caption illegible in source). DNA, left to right: promoter, 5' UTR, exon, intron, exon, intron, exon, 3' UTR. Marked sites, left to right: cap site, start codon, 5' splice site, 3' splice site, 5' splice site, 3' splice site, stop codon, poly(A) site.]

HUMHSP90B: CCGTTGCTTG GGTTCTTCCC CGCCCCCTCT TTTTTTCGGA CCATGACGTC AAGGTGGGCT
HUMACTSA:  GAAGAGACCC AGGCCTCCGG CACCAGATTA GAGAGTTTTG TGCTSAGGTC CCTATATGGT
HSH2B2H2:  CTGATTGOAA GTTACTCAGA TGACGTTAAT ATCAATATAG TCCAATCAAA ACACGTATAT

           -110                                                          -51
HUMHSP90B: GGTGGCGGCA GGTGCGGGGT TGACASTCAT ACTCCTTTAA GGCGGAGGGA TCTACAGGAG
HUMACTSA:  TGTGTTAGAC TGAACGACAG GCTCAAGTCT GTCTTTGCTC CTTGTTTGGG AAGCAAGTGG
HSH2B2H2:  TCAGARAGCT CATTTGCATT AAGAGGAAGC TGCAAGCTTA GCCAATGGCC CAGCTTCTTT

           -50                                           -1 +1           +10
HUMHSP90B: GGCGGCTGTA CTGTGCTTCG gCT^^pE GGCGACTTGG GGCACGCAGTIAGCTCTCTCG
HUMACTSA:  GAGGAGAGCA GGCCAAGGGC alasaaaiCCC TTCAGCTTTC AGCTTCCCTGIAACACCACCC
HSH2B2H2:  TTCGCGCCCA GCAGCTGC^^M#@TGCGC GTCCCTGTAG GTTCCTTTCAjCTCACTTTCT

[Legend: the CAAT box, the TATA box and the mRNA transcription start are marked in the original figure.]

Figure 1-2 Promoter regions

yot-xv )vffi^jc«tobti&tit, -?u-t-x

i> '> X*T^gd^ij ip-fn ^e — X b LT<7)

-8 7,3)^ ^yf;i/;<

T. TATA^v^%^)je&a^^J[:#Bi-6h. HUMHSP90B -eti TATATAG> HUMACTSA t? t± TATATAA> HSH2B2H2T*{± TATAAAAt^^TV'&o mRNAK#^%S;{&^&$aL^TATA^vXX HUMHSP90B"C-27bp, HUMACTSA ~C-30bp* HSH2B2H2 “C-32bp fc * o ■CV'io Hi:, HUMHSP90B "Ckt 1 cco TATA^ v X X h 1 cx) CAAT # v X X, HUMACTSA *C* t± 1 o

v'9m%&#oo 73t-X#%(:^*^v^T#x.&cat&6o 7nt-x#%[:

Tm (:#&f mfRf (7)@m U j: c T. 7 n t - X #%<% GC 3"# & (D

7>/\°7fC=i-FpI^ dd-eti, 3-Kg#£H|j&3 K>t^3 ?x/c^ai-&m#a^(i ;]/a^j(:ZcT#m#U6^6o %.4rV> mR##?)yyf;]/azm:j:c'C #l%.jrV>(D5'*j@^m^='KX"e3'*%t±5'x79/fXgB&. ^ ^jL^rV X±^^fL3'x79/f xWk 5'X77T Xgg@. #^4rV >(7)m%li3' X7?TXggf&2:i^%3 K>?&&o

3 k > \zum-th 3 - KMSsaa?ij<7)Eitto^ii 9 ^ssm 3 s$mi4-c\

&AE*mf^B(:#$$3'a&&xx#(DT

— 9 — m^3-Kc^u-6im#^3K> cz^-c, ^m@g^<7)mN:&*m$^'cv^ca^ai6fL'c^6o imm§3K>a(±, 6&lo<7)Temi-0 GTT, GTC, GTA,

GTGii^V > (valin: Val) tl.S (fcl-1 : HU iHliENAMf

t ^ /E(±3 K><7)S2SS$'e-eM$n^o zvtztb, w>3Mtiitemgs mir f & C ^#3mm<7)Gcmm

#&#!#(;<&&o (i, AT 3- K##P)& h &o 2 W:, N#^3 K>(D3^/^-X±A%@(:ZcT#&C2:^#^3^Tv^ "o

at, v^>/^®e3-KLTV'&#%(3^ etui, -&#

^R#m3K>^miR3fLTv^^*-e&69o a-KW&myffJi'BEmckL m##

v>t LTtiATG, »3F>i LTfiTAA, TAG, TGA/5^nbtlX v>S (# l-l)o ^m3KX:l±, ATGW(:#-o^'7vff-mm3Ky^f)^"C^&»)o

Table 1-1 Genetic code

First base        Second base        Third base
              U     C     A     G
U             Phe   Ser   Tyr   Cys   U
              Phe   Ser   Tyr   Cys   C
              Leu   Ser   Stop  Stop  A
              Leu   Ser   Stop  Trp   G
C             Leu   Pro   His   Arg   U
              Leu   Pro   His   Arg   C
              Leu   Pro   Gln   Arg   A
              Leu   Pro   Gln   Arg   G
A             Ile   Thr   Asn   Ser   U
              Ile   Thr   Asn   Ser   C
              Ile   Thr   Lys   Arg   A
              Met   Thr   Lys   Arg   G
G             Val   Ala   Asp   Gly   U
              Val   Ala   Asp   Gly   C
              Val   Ala   Glu   Gly   A
              Val   Ala   Glu   Gly   G

^^3KXD±m^z^Tm(7)23m#(:A^"c, #mcmcTv^ch^#^3frcv'&^o mm3Kx:iGc<# 13K>mmmmmcmo##&i-&c2:&mL"cv'&t*,
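As a minimal sketch of how the start codon (ATG) and the stop codons (TAA, TAG, TGA) listed in Table 1-1 delimit a coding region, the Python below scans a DNA string for the first ATG and translates in frame until a stop codon is reached. The function name `first_orf`, the one-letter amino acid codes and the toy sequence are illustrative assumptions and do not come from the report.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base slowest, third base fastest);
# "*" marks the stop codons TAA, TAG and TGA.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def first_orf(dna):
    """Return (start_index, protein) for the first ATG...stop frame, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        protein = []
        for i in range(start, len(dna) - 2, 3):
            # Unknown characters (e.g. N) are translated as "X".
            aa = CODON_TABLE.get(dna[i:i + 3], "X")
            if aa == "*":
                return start, "".join(protein)
            protein.append(aa)
        # No in-frame stop codon was found; try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(first_orf("CCATGGTTGTCTAAGG"))
```

Real gene finders add much more than this (both strands, all six frames, splice sites, codon-usage statistics); the sketch only shows the table lookup itself.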

-10- (RNA7794 9>7)

4>hax7)^ (RNA7 79 4 9 77) CdZ'#g2:$fL"Cv^6(7)"C&6o RNA779 4 ^BE^iJkL-C. 577947 gm#.Cf3'x794%gmdi&&o 4 > haxDW% (5'^j:Cf3'7794 7@;@(7)##) C(±, 5'GT - 3'AG 3 >-fe >4)rxmU*%\biiXi5*) . GT - AG )V — )Vb£ l£ti mAi^^Tha>'r(ihA,a'R-''r&&o 4> ha7^^<7794977K^(:^#^m&o 3'77947^{&±mcti, 14V

< '^o

C(D#, 3^7794 7gg@±#<7)4 7 1 a >#&#(:#, 779 4 7B#ti5^779 4 X

77947 % <&^#<7)S/77;T/aB^i$rai-3(:^i-o ^-(D&@cj3V'T. ascmmf&ct&^u Nii, Y#i4v <97%#^ mu. zY(±ctemL-cv'

intron

5’ splice 3' splice site site 01-3 77947 mm## 9 ri-mm

K#l*

&6<7)2:#x.6fi'Cv^o 2%, 3-K#*C(i, 3cm3K7(:#-7v^^*# (readingframe) 3-K®«©MMli3»M(7j:otv^0

? - 5 * - 2 - &ugk u a

-11- DNA RNA (Cl <7)Bf A(cislf& RNA^hT-iiprimary transcript RNA t H? (i'fi, RNAxy^/f ; y ^^^^(iRNA^U ^9-4fn(:j:^-Cfr6^6o W. 3'*^^CA"C(±&^ IzTffitV&WlimitbtiZ (RNA*°V y 7--tfII<7)read through)o RIIUoT# m^(7)RNA^f(i. (mRNA<03'*38 (^UA#tn@B&)) ##A(7)10-30bp@±m(:#tf&AAUAAAhv^p n >4z>-4-% #KrA<7)Tm(:6i,UG(:#tf#% (Urich^^9%^-hG^L^$m.6#

#) AAUAAA@^j[±^#m ^^UA#AnKf5(D#ati)^cT^i)^\ ^VAyyf;i/kie(fft&o ^uA^yf^

>-V-%@d^|-r6&o #-ik3 KXDTm"e#%(:mt»fr6

AAUAAA/<^->^4

^JE$tL-C5£C/"c3'OH*^(Ct± polyA # V j* y - -If (z X o X 100 ~ 200 ££ ti t'

&o ±mc^L^<7)(±, Rh^(±C*2:L^l#$Lm(7)K^Jm#m-e6&o

d!mma6A?&6o %^m3K>(7)iaeg^J(:Mf6AimhLT, Kozak;i/-;i/^#^

References

1) Aota, S. and Ikemura, T. Diversity in G + C content at the third position of codons in vertebrate genes and its cause. Nucleic Acids Res., Vol. 14, pp. 6345-6355 (1986).
2) Birnstiel, M. L., Busslinger, M., and Strub, K. Transcription termination and 3' processing: the end is in site! Cell, Vol. 41, pp. 349-359 (1985).
3) Bucher, P. Weight Matrix Descriptions of Four Eukaryotic RNA Polymerase II Promoter Elements Derived from 502 Unrelated Promoter Sequences. J. Mol. Biol., Vol. 212, pp. 564-578 (1990).
4) Fickett, J. W. and Tung, C.-S. Assessment of Protein Coding Measures. Nucleic Acids Res., Vol. 20, pp. 6441-6450 (1992).
5) Ikemura, T. and Aota, S. Global Variation in G + C Content Along Vertebrate Genome DNA. J. Mol. Biol., Vol. 203, pp. 1-13 (1988).
6) Kozak, M. An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res., Vol. 15, pp. 8125-8148 (1987).
7) Lewin, Benjamin. Genes V. Oxford University Press and Cell Press, 1994. (Japanese edition: details illegible in source, 1996.)
8) Sazuka, T. and Ohara, O. Sequence features surrounding the translation initiation sites assigned on the genome sequence of Synechocystis sp. strain PCC 6803 by amino-terminal protein sequencing. DNA Res., Vol. 3, pp. 225-232 (1996).
9) Shapiro, M. B. and Senapathy, P. RNA splice junctions of different classes of eukaryotes: Sequence statistics and functional implications in gene expression. Nucleic Acids Res., Vol. 15, pp. 7155-7174 (1987).
10) Sharp, P. A. Speculations on RNA splicing. Cell, Vol. 23, pp. 643-646 (1981).
11) Snyder, E. E. and Stormo, G. D. Identification of Protein Coding Regions in Genomic DNA. J. Mol. Biol., Vol. 248, pp. 1-18 (1995).
12) Wahle and Keller. Annu. Rev. Biochem., 61, 419-440 (1992).
13) [Author and title illegible in source.] Vol. 37, No. 10, pp. 952-954 (1996).
14) Wahle and Keller. Annu. Rev. Biochem., 61, 419-440 (1992).
15) Yada, T., Sazuka, T. and Hirosawa, M. Analysis of sequence patterns surrounding the translation initiation sites on hidden Markov model. DNA Res. (in press).

1-1-2

(1)

&<, z o r^-7»tHj > wn-cv^o tupie#$f. 7#m.

)V a 7 A:fOV (Hidden Markov Model:HMM)<7) i. o t°)Vv»6^?i<£> 3 ■oil

K - tuple###, f f-7 Wi7 - 7 vt± Efrthetoa4-&#£(7)SStiyEetig u @^ijEM-

#+#m: &&#&### gaRm*T=fV-C#9 Rcai%i-&:ef(2) i«$Sg«*r=f'J

&. (3) ^iS$tvrv>^0 tztzL, SSto&K - tuple Mr "Cii.

r-? (Dtyb r&^£#li-t££:*6e{±. ltl)0^9 art:

&#& 6 t(Dt u-cm^

§fit Wtt 45tifcfitMMMU t frta-to ztilzj: 0 . -t^-7eKcr -y<7)^6#"e#aL^±-c(A^f—7rrn-c^ma9^h:#ei?oc

-14- $W:, me%7;i/d"vXA(7)j:9^#$%

"Cv^v^ -me. ie^-tosm(±. tm^emme^f #%"C#f. M-o-flSeE^^'^^c * v acaez 9 .

#o)%^-^±%ez0E^j^ae#8ijejt#i-&chej:o.

a. K-tuple

v'fco MJU£. E 1-4e£$/ b^u^cnr < 7 a&KPJm-a&^fo v h 7 nA C®#^CIi, #&flie&®fcffi?!l»rfr t Lt cxxcH (^^L. x(±e^<7)7

t±. El-5 KyjkirXd t) . y F ^ nA CWffittC^i

Part of the amino acid sequences of cytochrome C

Human        ..FIMKCSQCBTVEK..
Mouse        ..FVQKCAQCHTVEK..
Chicken      ..FVQKCSQCHTVBK..
Snake        ..FSMKCGTCHTVBE..
Prawn        ..FVQRCAQCHSAQA..
Yeast        ..FKTRCLQCHTVEK..
Hemp         ..FKTKCAECSTVGH..
Tetrahymena  ..FDSQCSACHAIEG..
Rhodopila    ..FHTXCILCHTDIK..
Microbium    ..VFKQCKICHQVGF..
Pseudomonas  ..VFKQCMTCHRADK..
Motif        ..xxxxCxxCHxxxx..

Figure 1-4 Motif of cytochrome C

15- Aitken <7)iS5(] 4Ef- —7 Aitken4 3&1*S7 5 ;Wt&W'*?->K' lo'<'X£ tibtz Uo fztiL, Aitken tcov* Prosite(iEMBLZ h

%> o

BONSAI

BONSAiAB#at&m8K7#tnI@B"C6&o ## KJ3 V'T\ Kyte and Doolittle ©BMctt t>f7H (:#.% 75V M£iJ<7M > r y

Roomanh Wodak6(i, 7 5 vm^amKrM-^amf em<7)2o$^(i3o<7)7 5vme#^L^^7->^mmL,

37^iA'$->

smith&i±, 30^75 ^-7coJiiUB^^fo* 5)o :o#S-e, 33 #(7)^^##, 18 «<7)DNA integrate, 30 # 4: 2c^)7 5 40(7)7 5 -7&jEo(t&C

b. f-7%ai -et±, tf-7$-gmi@r^^^maR(7)*Td'v dfrczix tf—7(:%#%m#&^x., t

-16 &am-c#a. $6c.

(2) ##-#-/<##& (#%.#. yh^aAChv^@a@m*T=fV)

*T^V) ^'fL6-&Ltv\ c^ch(±. #A"C.

tf—r%(:iELv^&

%)tf--7j -^iHitsitttfflU'C**}. sVdMiz-^z.bfitz^yT ’frr-tfrb r«t 6LV'(#$M)ff--7j LTm7f'^&6o r±6 6Lv'#$%tf-7j o^LT. fi-cv'&o f##$-cmmLt&#T&ac AMCIi. ag3FiJ

MDL## mm&7E')* ii:± f). r#$%tf--7j nrmat^o ttL. -iKC. & h#cT$&i(D7'-? c#f & & a ^ 7 im@#& & o eg A . gn-6<% y;i/-yc#LT. r#^<. -E-7-etitfuf.

-17- vo*e WWftf K«#f* inti *V* «fi? «*• *.. ' 01 *w •* F<- eu fin nw »-< 0U- fftta *W O* ^r9 09t •v» 0** m*i El-6

&o ClfU:#LT. -# & ^^mm;u'c&&o ^mm i~%>'1]&(D^'3 b LT> Hd^ESS-'J'(Minimum Description Length:MDL)S¥71 /$?£n b tvC V' %> o MDLSf«fA^(i, -E-tf)SMStmT<7) J; 9 12 if y b<%## “101110111010” ^^#6 ^tt^^v'kTv h%-e#kmf&c:he#A&o co##&f?)$2mfufi2Cv b#m-c 2CT. tL, ECi##e. mm ^ fomo-cumm? # if v b %-c^ mT#6C2:4:^&o mAlf, ±Eeil(i, “101* ” fMEomOiEL. &2fL. *(^)gG^(i “no ” o&tfyb-cm&mx.t&fx a#x.&chM#&o cm#&mu&#ifi'&m&

##(%]&# 1,0* i:iifflft&'l?&<* ri'o,

ct##^mm^$x.TtmTf v b%(±^^< MDL&$c mu&#%f&&&c#g&tf v v b %(D% & #/j'ik-#- & a 9 & mu t m $ L v' mu a & & <, 2^MDLm#(D#x.^em$%tf-7mm(:mmi-^amTV)i9 c^&. eux.wr.

—18 — y 6B8C, FCXXCH2:LXGXXXRe#Xlfy FCXXCH2:LXGXXRaGPXLe#'C(fy|'^oAC'e6&J tv\9 2o(7)t f-7=m#6fut:af&o emmmccimL'Ui, -7^7< ymK^iJAW'6y b -c#&„ 22-e. W/<-t>hlEL<&M'r&&#Bije#LXkt#(±0 1:&&o

h%-c^#ibf6ca^''e#&o

(krvi').

342 {\£y H> -^> m%

##^4ooif v h. m#^404^ v h if«Msi i)II Li'tf “71 SEtiWUd'UXA

zmim@&#&f6:5#aL-c ^Rm^m^A*7;i/xv XA(/)Z ^

c(DZ9^#^c(i. mf^^T^xvxA (GA)»)#?)#mmG& mv'&zaej; i). GAti. & ri@#:j ##:^'e#^gg^%eAfL#x.& ra%J. & rm^j #&#mc

-19- -oz-

3B¥mmm °4-^WSf?t:lWWHa»4-^iOYa-f'l

liIMH«^0VO^>K‘ Z-T0 9Z'0S 910 S StiOV 66'0 99'0>t 00 I- tfO ezo d VZ'OA X. 00 LH txyioSoon 9sov ooi. MOV ee'ox.oo'to o c> — ‘ -o 00'o'ti xv zz-oa 91-'01 oot \ ee'oi 00'l ITO© ero

<-i] L-l M °??a&-6KUi wwh 1/^1 £C1fAl|:H

4TO2)iiMtox -v 4-4fii® "0 ':)9 3 °9#J.iy3T94#2#a.$BWS&

^-64^4 p-^mc'Tiw&i-i^i ':m- °?4-if? X-^3r4im3 4-M ^ tSEWW '?) £ -^3r £T 2Mf43r X n ^ Z-lffi mW/1-43r§ 42)1^9: O

VO °^ T4^^o-a)^^SS^nE2)ffltfl^-'49r?]V9

4-&14-4<( xx °2n4%f $##W?

'?)3W#VOW41:?TXfi;:^m#24l#:)miM4.-'4ar '1^^ if 9 ? 7 9"W= 9 ^ 41 #:)% &o ctDh

HMM"C(±.

^-7#mt HMM j: 9 4\ &2>®m~mirZ>Mfrm!lris

f&bt), #@tvh HMM (I J: *) *TWt1r& dt &V§ -So HMM^f-7

&o #$^^^<7)4:^ mmm*. #Bm*.

v h 9-^(NN)#(7)#(7)#$f

Tv>6A^$v^ a/fi/>i/ v/<-<7)e!l'e#^f&i9(:,

^ ^ 9 %@(±# 6#mf & AT#

HMM C Z&fjm#, #£<*>* r =n; Izmir Zib'Zi

h'fi^:(fm'C^&^2:'9^$'mi-K%a#x.'CZV'o HMM(T)##t7 H:

d y 7^-tf-

&o

ay y/^-tf-7 (Dtiiti

HMM LT, i®?)i:un>f y >y y/<-t^-7(7)m(m:

"3V'Til8^i-&o #m(7)HMM(7)#W#&^(±. HMM

<7)b^ai;-^^AT#@i-&ac:5t:#m^6&o dfu:j:0.

o/fyyyyX-kL DNA#-&@B#&# o*&#<7)g8@(:ai%f 19 x 01-8 (:&& j: 9 (- a ^V y ? xSlBUis

V'Ta/f y>(L)^7cg#cmmL. 9 ^yv/<-( 7)Z9 t:a^ y>^^f&d&

CA^f&o C(7)j:9^a/f y>y v/i-^#ogG&<7)K^m#e#@t y

-21- @ 1-10 o J: 9 ^ « &#%9#acav'T#& a/fy>(L)&&v4iXV>(V) hi'c^#7K#<^7< V%^^rin](:#^o"Cv^(7)^t)^6o C^ZotzA'i y Xx# 3iii [helical wheel] t LX'i&%frb%lbtiX^'Z>i)\ HMM £fflV'£ £ t tz X Y), @#

x-XA-x^&E mAM*m^z>tm*mmb¥m^v v^wt

Izmir &r- X A-X cDtkmzn 0 1-9 It HMM rmm L /:nly> y

t. y b t^itznmy v/<—tV'9 3X > b^#tO$fL"CV'^v\ KXiJx^tKi rJUIKin^yyy-y/H UEEi"^ (3 b°mim^ < k v i #m, 01-llti i* fc, x V -y X ^ - ;v K \z'^ ATF6_HUMAN b J£ At> ^%/7_ZfHMAlVC[±mvK#7 < #:^[:A'1 -yXx#me ^<0^v^MFC_AWMC4;ii#fLV^helical wheel#me t otio 9 , n^y>y y

[Figures 1-8 to 1-11: leucine zipper figures. Surviving labels: "Observed from above", "Leu", "Basic", "Hydrophilic". Captions and remaining content illegible in source.]

References

1) Aitken, A.: Identification of Protein Consensus Sequences, Ellis Horwood Series in Biochemistry and Biotechnology (1990).
2) Bairoch, A.: PROSITE: A Dictionary of Protein Sites and Patterns, User's Manual, in SwissProt Database (1991).
3) Shimozono, S., Shinohara, A., Shinohara, T., Miyano, S., Kuhara, S. and Arikawa, S.: An Approach to Bioinformatical Knowledge Acquisition, in Hawaii International Conference on System Sciences (1993).
4) Rooman, M. J. and Wodak, S. J.: Identification of Predictive Sequence Motifs Limited by Protein Structure Data Base Size, Nature, vol. 335 (1), pp. 45-49 (1988).
5) Smith, H. O., Annau, T. M. and Chandrasegaran, S.: Finding Sequence Motifs in Groups of Functionally Related Proteins, Proc. Natl. Acad. Sci. USA, vol. 87, pp. 826-830 (1990).
6) [Author and title illegible in source], vol. 8, no. 4 (July 1993), pp. 37-48.
7) Rissanen, J.: Stochastic Complexity in Statistical Inquiry, World Scientific, Series in Computer Science, vol. 15 (1989).
8) Goldberg, D. E.: Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Publishing Company, Inc. (1989).
9) Fujiwara, Y., Asogawa, M. and Konagaya, A.: Stochastic Motif Extraction Using Hidden Markov Model, in Proc. 2nd Int. Symp. on Intelligent Systems for Molecular Biology (ISMB-94) (1994).

-23- (2) yyy^am&miLtmfr fi. ##$tL"C#Tv^o 2ft6<7)imfzf@^J(7)4'[:(i. ##(:-ov'-CI± *&m6o6#K##i-&o iz#>P)^-i)H'' 0 & 4?-X.& ##<&## (DDko'0&6o mTC, DNA(D#^K^J, K^iJ(D?mi#&[:'OV'"Cl8 ^f 60 a.

/ A @am' & y yf a,E9 J&& c k y v A @d^j^^^& cm#-c&&o

C Z CT##Gf(f 6ft&. 2(7)K^J/<^LK3FiJ^6 y y f

MkLTkL RNA%y7 4y@B&^##^A6 ft~C V' %> o J^TC, m*^J^#mL^yyy;i/K^J(7)###&(D##MkLT. RNAyY74

RNA%7°7<%Sm(D#^ (M. B. Shapiro and P. Senapathy (Nucl. Acids Res. 15, 1987, 7155-7174)) 2<7)^?£'Ct±> E&t69 5'X-/7 4 Xg|M^ 3y X7°7>f XSlS<7)E^Jy-^^S* (20 l&£ "C i± GenBank Release 46 £ fijffl L T i' & X ^M'fUZ^^X, &M2'kizSX 2<7)m$-m*ff^jkLT. s'

Po (1) 5'xy7 4 xM<7)xny!t# 5'%y9 4 x@g@:(DX3T(i, rn$

%) o

$$\mathrm{score} = 100\,\frac{t - \mathrm{mint}}{\mathrm{maxt} - \mathrm{mint}}$$

-24- mint : (8mmamgB#K9iJ(0&#@#a@(7)/<-t>T-y#'a#) maxt : (8mM<7)@g^Sd^J(D#{&@^#^<7)/<-4:>T-yt>^XR^i (NCAGG) -c, #tx37^mi<»&8®#^#9o tzfu 3'%y7/rxgG@%±
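The 5' splice site score above can be sketched in a few lines of Python. This is a sketch under stated assumptions: it assumes that t is the sum of the percentage frequencies of the observed bases at each position of the candidate site, and that mint and maxt are the sums of the lowest and highest percentages at each position, which is how the surrounding definitions read. The table `TOY_FREQ` is invented for illustration and is not the frequency table of Shapiro and Senapathy.

```python
def splice_score(site, freq):
    """Return 100 * (t - mint) / (maxt - mint) for a candidate splice site.

    freq[i] maps each base to its percentage at position i of the site.
    """
    if len(site) != len(freq):
        raise ValueError("site length must match the frequency table")
    t = sum(column[base] for base, column in zip(site, freq))
    mint = sum(min(column.values()) for column in freq)
    maxt = sum(max(column.values()) for column in freq)
    return 100.0 * (t - mint) / (maxt - mint)


# Hypothetical percentages for a 4-position site (illustration only).
TOY_FREQ = [
    {"A": 60, "C": 10, "G": 20, "T": 10},
    {"A": 10, "C": 5, "G": 80, "T": 5},
    {"A": 0, "C": 0, "G": 100, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 100},
]

if __name__ == "__main__":
    for candidate in ("AGGT", "GGGT", "CCAA"):
        print(candidate, round(splice_score(candidate, TOY_FREQ), 1))
```

The normalisation maps the best possible site to 100 and the worst possible site to 0, so scores from tables of different widths remain comparable.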

to 3'%79 ^ 37(±mT(7)^'e#+#$^6o

$$\mathrm{score} = 100\left(\frac{t_1 - l_1}{h_1 - l_1} + \frac{t_2 - l_2}{h_2 - l_2}\right) / 2$$

t1 : ^VVyf-##f-yC>'&#+
t2 : 3' xy^/fxgBti^^yty-ihXK^I (CAGG)
l1 : i^V < y>V vf-#%<7)10#^rf. 8##^#
l2 :
h1 : < y >V ^f#^(DlomBfr^,
h2 : n >-b ±1B (1) (2) (OX 3 7#t#(: Z 3 7CZ 0 , X79 ^ X@M&(D# o b. 7

acto. wm?>/

KL"CV'& ^

> b cisJSt, i z>Jj&\Z'D\'X'&'/ > bCZ^t^f

-25- (D. J. McGeoch (Virus Res. 3, 1985, 271-286)) Ciwjg (4, membrane-translation 9

m^^Ti^-to 1) l££n<£>membrane-translation 9 114@#<7)N

• Position 1-11 Hab-5> chaged residue(Lys, His, Arg, Asp, Glu)C# < position 12 ir b TM&'t & 4 b C##T b o CLO^iSC 4 t) ^ 114 Bd^lj fp 113 8d^U (4 position 12 6 $n ^ 6 charged region (CR) £ uncharged region C5K 4 <5 ft/t (1 Sd#|{4CRtfs 17$IJ?£ Wl-St/CH^o £r)o 2) _hBH7 7 4 > h /ieb> Bi^Qco membrane-translation ^ Mtcov'fjy. T<7)SS^MoGtijL^o

• CR <7)fflE<7) net charge t4> -lfrb+2 ^'liCiloti^o 4 OJimfflto %#^#(7)aE3FiJ$r#c ^#^"C(4. -2^ + 4 CM LT • 12 - 19 #B OtSliiC54 charged residue /5S^< Mt> • UR7> 12 - 21 #B ^S^KttVtSEa1 !* x non-membrane-translated ? > ^ 7 Si

• UR<7) 12 - 30 # B <7)Fe9 <£> 8 H<7) peak hydrophobicity £ UR <7)6$ £ b-tirfc 2 {ATcplot £> membrane-translated ? M £ non-membrane- translated 9 y ^ 9 M £ WFS C 4 < F< g!l "C § £ o 114 b "b 110 iE£!j(96%)/i\ ’’ membrane-translated”

-26- W(a,i)t±> position - 1 t + 1 ShBrS'f£<£>£lI -7 -f > / > l'"C(D'eS > h N(a,i) (##i<7)^&a^)%)

ft-fficOT < V> bO#l##T#!l 0 ,

$$W(a,i) = \ln\frac{N(a,i)}{\langle N(a) \rangle}$$

X(:*T6#iE(D^:A, T < /#* 9 > h"? h V y T<7)0S%i±> SUDS^Et-x 1 position -3 b- 1«0 ii^'s i±, iloMSrfirOo #JDS<7)S(rx 1 .

<^A@gM%N"C#!l&o

$$W(a,-1) = \ln(1/N) \quad \text{if } N(a,-1) = 0$$

£> o b i) fc> D -b 9 &5MtfralH£{±> (#/E<0) SA-fr?!! b > -?■ tL-?'fu7)position (?) S<£.?>nlH\

$$S(i) = W(a_{i-p},\, i-p) + W(a_{i-p+1},\, i-p+1) + \cdots + W(a_{i+q},\, i+q)$$

(i-p^i + q^ ’y 4 y Kyips-e^tfit) czo-c, K®ose9iJ iS-x + -v >t&b t

$$S(j) = \max[\,S(i),\ i = 1-p, \ldots, L-q\,]$$

at^u MfrtmzmmLtzm^, p = -12, q = 2
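The weight-matrix scan described by W(a,i), S(i) and S(j) can be sketched as follows. This is a hedged illustration, not the procedure of the cited work: the weights are log ratios of observed to expected residue counts, zero counts fall back to ln(1/N) at every column (the text states this rule only for position -1), and the window is indexed from its left edge rather than with the p and q offsets. The function names, the three-letter alphabet and the toy data are all assumptions.

```python
import math


def build_weights(aligned_sites, background):
    """W[k][a] = ln(N(a,k) / expected count of a); ln(1/N) when N(a,k) is 0."""
    n = len(aligned_sites)
    width = len(aligned_sites[0])
    weights = []
    for k in range(width):
        column = {}
        for residue, freq in background.items():
            count = sum(1 for site in aligned_sites if site[k] == residue)
            if count:
                column[residue] = math.log(count / (freq * n))
            else:
                column[residue] = math.log(1.0 / n)
        weights.append(column)
    return weights


def best_window(seq, weights):
    """Slide the matrix along seq and return (start, score) of the best window."""
    width = len(weights)
    best = None
    for i in range(len(seq) - width + 1):
        score = sum(weights[k][seq[i + k]] for k in range(width))
        if best is None or score > best[1]:
            best = (i, score)
    return best


# Toy training windows and background frequencies (illustration only).
SITES = ["ALA", "ALA", "AKA", "ALA"]
BACKGROUND = {"A": 0.4, "L": 0.3, "K": 0.3}
WEIGHTS = build_weights(SITES, BACKGROUND)

if __name__ == "__main__":
    print(best_window("KKALAKK", WEIGHTS))
```

With the window parameters quoted in the text, the matrix would have 15 columns; the three-column toy matrix only keeps the example readable.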

tZo < ^-x. c a c z i). y gB@

(1995)

2) 9 b V > - DNA (A#) (1986)

(3)

DNAOga^iJ t*- ? M$$rU LT 730M(GenBank rel. 98> 1996¥l2fl)> * >

-27- (PIR rel. 49. 1996^6^) ^ - ?/<-% &:##$fi-cjy 0. DNAK^I^f -15^ "C#h V'9 ^-x

Haemophilus influenzae u. Mycoplasma genitalium 2). RIF? >##m4?av ABB #15. ±mm^Ay/A@E^JtGenBank(:#m$^o (:ov'T&. CerevisiaeCK#. 4"#% ^motC. #A. yO^xaXa^m^y/AK^J^^^^fi

atoy-##cz i), AKpjki&#:6&w±gG#M&mm#&#

D. j: 9 ^ h m? § -l)o H. influenzae

b) Mycoplasma genitalium a) Haemophilus influenzae 0.58Mb (1995) 1.83Mb (1995)

identified 470 ORFs 1007 identified hypothetical 317 \ 347

mi-12 m@y/Amato

m9%5mm%#dsatny-Wf[:j: 20%^f-7/<-X[pmlRm^ >/<

##C#LX(±. 68 %mORFm##

A KPJ3 - 6H#mz@%ds. a t o c z ^ x 0%r

-28- a. *4=Py-»f;i DNA(Dm&SB^I2f9L. 6&v4i^>/<^®%7{Smith k Waterman t X oTJ^pjrto^Si&SSSiih LTt&.B.$ tifc 4\ Si to St® ?i (Dynamic Progamming; DP^) (:j:6&Wsm$(:&-3'0,'&o @t. mtoSt®&(:*cf<^^[±:t##^^It

^DNA66V>(±7< PASTAS BLAST^^t^tt&dk^t^ t^t, dttG ^#&(±#tostm&z g;#to&-Bc&m%A[:m%&

SitoSt®&, sut, Smith-Waterman &C <£ %>mmimmWM*m M3 t X «J Mm i-^>8) o cent, PROTEIN h vM 1C^n k PROFILE k^o %¥?>]£&&1rZ>o #to

-e

t T v»a ##& & w±# wxomm# e & #-& c a % ^ 7 #12 <%# & & M tpffl<7)J§&lzi±0 ZkZo {ltdd-et±MIft0J3&cD T. —^>/<^%(D7

(a) Score table

      P   R   O   T   E   I   N   F   T  Gap Len
P     5  -2  -1   0   0   0   0   0   0  -3  -1
R    -2   5   0   0   0   0   0   0   0  -3  -1
O    -1   0   5   0   0   0   0   0   0  -3  -1
F     0   0   0  -3   2   0   0   5  -3  -3  -1
I     0   0   0   0  -2   5   0   0   0  -3  -1
L     0   0   0   0   0   0   3   0   0  -3  -1
E     0   0   0   0   5  -2  -2   0   0  -3  -1

[Dynamic-programming score matrix for PROTEIN against PROFILE: cell values illegible in source.]

PROTEIN
|||  |
PRO-FILE

Figure 1-13 Sequence comparison by the Smith-Waterman method
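The local alignment in Figure 1-13 can be approximated with a short dynamic-programming sketch. It is a simplification under stated assumptions: a flat match/mismatch score and a linear gap penalty replace the position-specific score table and the separate Gap and Len penalties of the figure, so it will not reproduce the figure's exact alignment. The function name and default scores are illustrative.

```python
def smith_waterman(a, b, match=5, mismatch=-3, gap=-4):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(smith_waterman("PROTEIN", "PROFILE"))
```

With these flat scores the best local alignment of PROTEIN and PROFILE is just the shared prefix PRO; the longer gapped alignment shown in the figure depends on its own score table.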

    A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X
A   4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -2 -1  0
R  -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -1  0 -1
N  -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3  3  0 -1
D  -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3  4  1 -1
C   0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2
Q  -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2  0  3 -1
E  -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2  1  4 -1
G   0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -1 -2 -1
H  -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3  0  0 -1
I  -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -3 -3 -1
L  -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4 -3 -1
K  -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2  0  1 -1
M  -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -3 -1 -1
F  -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -3 -3 -1
P  -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -2 -1 -2
S   1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2  0  0  0
T   0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -1 -1  0
W  -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4 -3 -2
Y  -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -3 -2 -1
V   0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -3 -2 -1
B  -2 -1  3  4 -3  0  1 -1  0 -3 -4  0 -3 -3 -2  0 -1 -4 -3 -3  4  1 -1
Z  -1  0  0  1 -3  3  4 -2  0 -3 -3  1 -1 -3 -1  0 -1 -3 -2 -2  1  4 -1
X   0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1 -1 -1

B: aspartic acid or asparagine, Z: glutamic acid or glutamine, X: unknown

Figure 1-14 Example of a score table (BLOSUM62)

El-13 (b) h V 7 ? &o tbfrh-Zo ov h V 7 ^x

-30- V HJ 7 ? mzMl&'t & jest’d L Virtue LtJf^x a Tt\ £±<7)#<7)£T<7)X n 7 ffL^K^lCf-vvygrAfL^tODT, #±, &#So X 3T(7)#± Bll-13 (c) PROFILE^##(DEe^V'-C(I(^#"emm^hfL&Z9 KMR"C#m#^tV'@g^"C(±'!-^x=r7(i#(:t&^'e, x LTv^

&o mh, ^<7)CT*et± profile «e i±^^ffl«i**BiaBa9ijiw^jti*v'c:ic^So -^, FASTA"C(i, <%-3<7)Xf #-Xf vYT(±, afm$k<7)^^J (k-tuple) AT^mCf&o -E-LT, WsKfJ &^T(7)k - tuple(Dmm&#^f macmAi-6o ^WSti, -iWA?

^Jjt#T(ik=4-6, Tc, f ;^T^f ^#aLTtX37^#5< #%XTv7 -eti, caL-c#6^t^&*<7)*uf ?
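The k-tuple lookup that FASTA-type searches rely on can be sketched as follows. This is only a hedged illustration of the idea (index every k-tuple of the query, then count matching k-tuples per diagonal of the comparison matrix); the function names and the default `k=4` are assumptions, and real FASTA adds rescoring and banded alignment steps that are not shown.

```python
from collections import defaultdict


def ktuple_diagonal_hits(query, subject, k=4):
    """Count k-tuple matches on each diagonal (subject offset minus query offset)."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)  # positions of each k-tuple in the query

    diagonals = defaultdict(int)
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], ()):
            diagonals[j - i] += 1        # same diagonal means same relative offset
    return dict(diagonals)


def best_diagonal(query, subject, k=4):
    """Return (diagonal, hit count) for the densest diagonal, or None if no hits."""
    hits = ktuple_diagonal_hits(query, subject, k)
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])
```

A diagonal with many hits marks a region worth aligning in detail, which is why the table lookup is much cheaper than filling the whole dynamic-programming matrix.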

S^If ■? o BLAST?#, ^v^Lt^ m^/<-xt:Mmcm^L"cx37##[ee9ca(:z()mw#%e#mf&o tt, c

L^L,

BLAST, FASTA, Smith-Waterman *£<7)®?& & #\ ##

-31- x X jesje^auar) r/q^ggo)^^ '£) ouiio;s ^ jap^ug °

Y ^ % 4 gr T :)##^1B#^ ^@4^/ 4 ^ Q $ °8i?^:LTf33mS###m-/3at*:yi##3^ Ar^o*r^@g@/ p@vNa

?iVNd0-4fv/l [i£BB$/ 'Till-A ^ ± L ISV79 VY ? Vox ^ma^gg#/

33m!llWo> ? '9: p@o)%i °9#a.*#^j_@%#%i#?r?aisvaa vNa '#%&a34W4m3im#:L^;w#m(m#9 \Hj.xisna 'Y#:)^B@VNd ^disna?<$j.< Ex_K3$ qrr^gger t^c^isna "?;:)*# °9^^m'?(r^^#4m?r^QE9 4a 'QEs# ^ccmgmw-^MpavNa K^ri^evNa °9%=)3T9^3@#3?ri42g#/szwTwm:&:?ri4?@a@/ '^n^#:)#/^^:m/^^#^isvia


2(D#m^59#m<7)mf5f (^^v>%385) %&22d'63K\ $d:?#|T#^d'-od:J:XV>^m^27d'619(:, $6Cmmx."C?- iM L/2^X V 57 d'6 17 C#mf6 2 t d*T # d:o DNAK?iJf-?&[f7 < #d:C^$fid:yy A@g ^Ijd^, f-?X&M6d'<7):ttoy- &#-#-6f 1- %c"C, i:]#S$6 6^)h#x.6o #C. 7

References
1) Fleischmann, R. D. et al.: Science, 269, 496-512 (1995)
2) Fraser, C. M. et al.: Science, 270, 397-403 (1995)
3) Needleman, S. B. and Wunsch, C. D.: J. Mol. Biol., 48, 444-453 (1970)
4) Smith, T. F. and Waterman, M. S.: J. Mol. Biol., 147, 195-197 (1981)
5) Pearson, W. R. and Lipman, D. J.: Proc. Natl. Acad. Sci. USA, 85, 2444-2448 (1988)
6) Altschul, S. F. et al.: J. Mol. Biol., 215, 403-410 (1990)
7) Brutlag, D. L. et al.: Computers Chem., 17(2), 203-207 (1993)
8) Luethy, R. and Eisenberg, D.: "Sequence Analysis Primer", pp. 78-82, Stockton Press (1991)
9) Gish, W. and States, D. J.: Nature Genet., 3, 266-272 (1993)
10) Guan, X. and Uberbacher, E. C.: Comput. Applic. Biosci., 12, 31-40 (1996)
11) Kasahara, N. et al.: Genome Informatics Workshop 1996, 202-203 (1996)
12) Hiraoka, S. and Nagai, K.: Genome Informatics Workshop 1995, 152-153 (1995)
13) Snyder, E. E. and Stormo, G. D.: J. Mol. Biol., 248, 1-18 (1995)
14) Xu, Y. and Uberbacher, E. C.: Proceedings of Intelligent Systems for Molecular Biology 1996, pp. 241-251 (1996)

(4)

<7)^&tzA# < 3-f^f T, (l) # @#E#<7)#m. (2) Z(Dl n tcLTtti^fLTtn-fM XT A^E-o-t"

t a <73 |§|g tz [o] {f tz t- If tz t> ti X v '& D

(i. a Kyrnmammmm&r&mmf tzl'2)o wjzmm&mtzanx., 3-y^>y#%-c(± x y° y 4 v > /®t^x X— b *5 J: yx h yfa K > (DMfrii & 'iiM, ~f n ^ — X l± f f - 7 M (7)^#^ t f - 7 B^J (7)^^%*^ t»-± ^ $ ^"C t'

"C6&0 -~B, #Em<7)E^ti, 5 t ^-if IEX^6<)^^T7VtoSffl^S$n-C £ £c ft# &<7)tzt±> Eft'ty;v 5~7t zz^ — yjv^y f 7 — X (Neural Network:tlTNN) 8 ~10\ EiLv;i/3 7^r7V (Hidden Markov Model: &THMM) n~13\ &£*14\ ±E<7)#E#^A*y-72:L"C, #ES<7)E^^ML-tv^0 o£ t> >

T ;v tz*iS L * #i@fb7 =f') X' A tz X o X> #Sf tz|ij L £ * T ^ y * - X <7)1:M

C(7)l#. ^<7)#

-34- tl&o yy Stotf@& 16) (Dynamic Programming: D.T DP) IrS DPtrli, ±E<7)ty;i/[: j: 37C&-7 V'T. DP 3-y<>

f&c tcZo-c. ##<7)3% h^yyA^-^t:^L"C%$%(:#a'fl:i-67;i/=fvx

y/A (mGcf) ty;i/(0##[:^(f&#^<7)#6m^^m@(±,

fLT# y - ^ h LTmm$fbTv^^

(D^^n^zmsmt&ztfr'e^ftfr'Dfzo zotztb, SSS^eyyvti, izkA,£ 79 V i990^m%^mkL^yyA7a^j:y yyA@g^j(:Nf&$i^^^^ M3yma(:##2ii"c#&o #c, 7af-^-#%<^ty-7@^ja^af^<7)%M

we%#%c#mf & c a ^-c #, & 2 a c ^

O/to #i& yv A#mmm<7)^A<7)m#ty;i/h L-c. HMM^g§ti-a^0 hmm t±

msfi&cad^d'ct. HMMi±#^#&%jEm#m

-35- V-A'>7 b^7-£h*

#mt&c2:M#6'e6&o $-^#m"cyvA#j&^mmnr#"c&&ch[±, #t;g@

7cib$fL6C2:(:#Bf6o ^(7)#. %7

7& 7 c hM# h ^ . E 1-15 C HMM t: j: & y 7 A t y ;P7)##e!l Protein coding region H------H

El-15 HMMC Z & y 7

HMM(±^ v b7-7rmm$fi6o 7-K(i^^mL. 4m

A 'yStiTv^o HMM"C(±, $ x b 7-7 b *°n 7'-1 tf )W?7 7 - 7 B £ 0 X g?& ^#-ey7A@E3F!i<7)##^f T'v >y$4i-cw&o

E 1-15 CO HMM ti> 7* W £ ^ 77Wk L & 7) "C & S = 2(7)#. V

b 7 — 7 l'-t'Dy-(;ii < x 7 - b 7 K X l*lS 7 K>> Xb y'fnyy

av'7-4-^ 7 ;v^$S6L> X7- b 7 K>^i5x b 7 yn K>£x:<7)S£b* x b 7 —7^s

7-rV 7 7'#memmLTv^o i8Z(:m*SLTV'&#gB7 K>7)$x b 7-7 b*°n y-[=L 2^7 K><7)mmmm^7-y^ -cv^.

)&£> 6B7V-Tp<7)jlSS^(b)(j;^7)illt<7)IFi^SSS:S:SL-rv^o x 7-b 7 K>^ <7)3#M(c)(i%7-b7 p^ggn K>^03##^(d)(i2m7 K>7)

— 36 — HMMTkL amm-ribao

X 7 - b n K 7 (7)±i,f C Shine-Dalgarno K?'J (D.TSD E^iJ) 22) fcnflf zmmira hmm (^H x .im 1-16

Figure 1-16 HMM representing the Shine-Dalgarno sequence

^<7)Tmi:SDKF!l^^X^- j:V'o E a. v-7iNmMit±sd @d?ijth<7M£@iuttas u zat 'f-ymmMb^L^c f'J-y 3 xanDi(±SD8S^iJ4 1 t S^(7DSSlilS(ijHSi"af$E^i3{ia^> -7 y^ESliffiS-rao C(D j: 9 (:. HMM -e i± f- - 7 @B£|J <7)1SfR £ y ^#§©^077 7 )Vb LX^Mt a c t a0 gffiW^yyASS^iJIflRSrE 1-15«HMM^M-C^ao 7 ASe^iJ^Ba GC (:"@A^(G + Crich)#ma AT UWA£'(A + Trich)#my#l&El-17(:^i-^o E 1-17 (7) HMM W\ G + Crich K 7 A 7 t A + Trich K 7 T 7 *°- ■T' V 1 "C'fUfiSt $ flX V' a o ^ 3 7 rb°— ^7 b E 1-15 (3/jvf" y bV-XhEDT^a. El-17Tti2c»<7)3 7^-^7b^%S(:^L, ^-(7)^^

Ei-i7 B#%»y/A##emmfaHMM(7)M

— 37 — G + C rich b A + T rich ^ i± A + T rich ffiMfr bG + C rich

7 t i> oimn) "C6&. 0 1-15 # HMM C#f 6 7 1/ - A '>7 ix 7-^T7V?)ffl&iX&f!j£g| 1-18 CC-eii. xr V> AAAWt^^f;^/^ L

Figure 1-18 Example of an HMM that handles frameshifts (states shown: Begin, Match, Insertion, Deletion, End; example codon AAA)

TV'&o ys

4 7^-i/g /

HMM^Ii, Viterbi7;vrf V XA24) tm£tiZ>t$MT )V^f') XACi oT, 4x.«b*t

^77 6^##%m# £1t & 9 C t ^nTtg-C* *» Viterbi r;^'JXAm HMM cm* acz cT. *Ah7)y7 AK^ijcM (SD@m, G + Crich##L 7 V-*'>7bX7-) (DilLUmfeZil&^tlZte&o
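The Viterbi algorithm mentioned above can be sketched for the simplest case discussed in this section, an HMM with one G + C rich state and one A + T rich state. The code below is an illustrative assumption throughout: the state names, and every start, transition and emission probability, are invented for the example and are not parameters from the report.

```python
import math

# Hypothetical two-state HMM: a G+C rich region and an A+T rich region.
STATES = ("GC", "AT")
START = {"GC": 0.5, "AT": 0.5}
TRANS = {"GC": {"GC": 0.9, "AT": 0.1},
         "AT": {"GC": 0.1, "AT": 0.9}}
EMIT = {"GC": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "AT": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}


def viterbi(seq):
    """Return the most probable state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    score = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[t][s]: best predecessor of state s at position t + 1
    for base in seq[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = (prev[best_prev] + math.log(TRANS[best_prev][s])
                      + math.log(EMIT[s][base]))
            ptr[s] = best_prev
        score.append(cur)
        back.append(ptr)

    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

For example, `viterbi("GCGCGCGCATATATAT")` labels the first eight bases as the G + C rich state and the rest as the A + T rich state; a gene-finding HMM applies the same recurrence to a larger network with codon, Shine-Dalgarno and frameshift states.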

References
1) Fickett, J. W.: Recognition of protein coding regions in DNA sequences, Nucleic Acids Res., Vol. 10, pp. 5303-5318 (1982).
2) Staden, R.: Measurements of the effects that coding for a protein has on a DNA sequence and their use for finding genes, Nucleic Acids Res., Vol. 12, pp. 551-567 (1984).
3) Gribskov, M. and Veretnik, S.: Identification of Sequence Patterns with Profile Analysis, Methods in Enzymology, Vol. 266, Academic Press, pp. 198-211 (1996).
4) Bairoch, A.: PROSITE: A Dictionary of Sites and Patterns in Protein, Nucleic Acids Res., Vol. 20, pp. 2013-2018 (1992).
5) Mulligan, M., Hawley, D., Entriken, R. and McClure, W.: Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity, Nucleic Acids Res., Vol. 12, pp. 789-800 (1984).
6) Guigo, R., Knudsen, S., Drake, N. and Smith, T.: Prediction of gene structure, J. Mol. Biol., Vol. 226, pp. 141-157 (1992).
7) Solovyev, V. V., Salamov, A. A. and Lawrence, C. B.: Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames, Nucleic Acids Res., Vol. 22, pp. 5156-5163 (1994).
8) Uberbacher, E. C. and Mural, R. J.: Locating protein-coding region on human DNA sequences by a multiple sensor-neural network approach, Proc. Natl. Acad. Sci. U.S.A., Vol. 88, pp. 11261-11265 (1991).
9) Snyder, E. E. and Stormo, G. D.: Identification of Protein Coding Regions in Genomic DNA, J. Mol. Biol., Vol. 248, pp. 1-18 (1995).
10) Reese, M. and Eeckman, F.: Novel Neural Network Algorithms for Improved Eukaryotic Promoter Site Recognition, Proc. of the 7th Int. Genome Sequencing and Analysis Conf., pp. 16-20 (1995).
11) Krogh, A., Mian, I. S. and Haussler, D.: A Hidden Markov Model that Finds Genes in E. coli DNA, Nucleic Acids Res., Vol. 22, pp. 4768-4778 (1994).
12) Kulp, D., Haussler, D., Reese, M. G. and Eeckman, F. H.: A generalized hidden Markov model for the recognition of human genes in DNA, Proc. of the 4th Int. Conf. on Intelligent Systems for Molecular Biology, Menlo Park, Calif.: AAAI Press, pp. 134-142 (1996).
13) Yada, T. and Hirosawa, M.: Gene Recognition in Cyanobacterium Genomic Sequence Data Using the Hidden Markov Model, Proc. of the 4th Int. Conf. on Intelligent Systems for Molecular Biology, Menlo Park, Calif.: AAAI Press, pp. 252-260 (1996).
14) Salzberg, S., Chen, X., Henderson, J. and Fasman, K.: Finding genes in DNA using decision trees and dynamic programming, Proc. of the 4th Int. Conf. on Intelligent Systems for Molecular Biology, Menlo Park, Calif.: AAAI Press, pp. 201-210 (1996).
15) Dong, S. and Searls, D. B.: Gene Structure Prediction by Linguistic Methods, Genomics, Vol. 23, pp. 540-551 (1994).
16) Bellman, R.: Dynamic Programming, Princeton Univ. Press (1957).
17) Snyder, E. E. and Stormo, G. D.: Identification of coding regions in genomic DNA sequences: an application of dynamic programming and neural networks, Nucleic Acids Res., Vol. 21, pp. 607-613 (1993).
18) Haldenwang, W. G.: The sigma factors of Bacillus subtilis, Microbiol. Rev., Vol. 59, pp. 1-30 (1995).
19) Burset, M. and Guigo, R.: Evaluation of gene structure prediction programs, Genomics, Vol. 34, pp. 353-367 (1996).
20) Fickett, J. W. and Tung, C.-S.: Assessment of Protein Coding Measures, Nucleic Acids Res., Vol. 20, pp. 6441-6450 (1992).
21) Borodovsky, M., Rudd, K. E. and Koonin, E. V.: Intrinsic and extrinsic approaches for detecting genes in a bacterial genome, Nucleic Acids Res., Vol. 22, pp. 4756-4767 (1994).
22) Lewin, B.: Genes, Oxford University Press, 5th edition (1994).
23) Ikemura, T. and Aota, S.: Global Variation in G + C Content Along Vertebrate Genome DNA, J. Mol. Biol., Vol. 203, pp. 1-13 (1988).
24) Levinson, S. E., Rabiner, L. R. and Sondhi, M. M.: An Introduction to the Application of the Theory of Probabilistic Function of a Markov Process to Automatic Speech Recognition, Bell Syst. Tech. J., Vol. 62, pp. 1035-1074 (1983).

(5)

GenelD X h <£> Guig G t±> £ 'G fi o JV — — x v X -r A GeneIDem%L^:^o GenelD

DNAKm:*HI:L"CV'6)o #XT?7

— 40 — Amc[±^<7)Z9^%T7y^#"C?m^e^9o $f^ixTvyc. ^^7Ky, Ky-y4 i'. 7Xty^-y4i'. &o ^2 XTvy-eii. #^;i/-;i/[:%c"C#lJ:^V>, #$&J:^yy^?#i-&o #3XTv y-eti. #2%Tvye?m$fL^%^<(0J:4rv>y &9o #4%T7y"et±. Z (^)%3760^V'{)6D^Hi^h L"C]&fo #B1 XT vy^)^M3 K>, Kf-y4 b, 7?ty?-4M b. $^%7 KXD^#(7) m;m3K>#<^±-C<7)@#(0Mya7 7 4" A'&fWL. &o #2%TvytDJi^v>(7)#^:XTvy-c#, ^c(oz^^y-;i/"e7.^y y,x2-Kf- y4 b,xl

7 xd&W. (xl, x2) &v >X d *7 7- AWtf&o Bi&J^yy### (xl,x2) - 7/f u yy-cd, y(:-7v^y#%a (24#) £BtSU -en^(7)ttt:$yv^yjLyy #Eafc LT> 7-yy y y a # 7 - f /f > y ##-?# ^ ^ e y & c a fLy ^

#-e&&o #T.yy E<7)# (j:4ry y • x77) ^ritSL> fiv'xn7«i^yy^it^o #4XTvye(i, #u:yyy. o#m±

Figure 1-19 GeneID (legend: predicted first exons, predicted internal exons, predicted last exons)

9 7i/-A&#c-c&tx

%> o

GenelDli,

mrni^f &x t tt\ $ 7b7-7^6-eblast em

GenLang X -y —T^:^<7) Dong t Searls t±> j5C#laX)^# t L-C%x.^GenLangy%TAem%L^:')o

GeneLang T*MMMU £ f- 3 A x * -SWeSemi"^0 t±, ffl&nF>, K b. yxtyx —-e&&o mmc(i#*%^#7^A. mm (3XF) *%m#x

7 X xllwnx b frb, &a?)y > X 7 XXfpnx b &HtM'tZMl z&^Xi'&o

— 42 — GenLangkL

GenLang l£3Ci£ £• ffl ^' T 43 () > ~f n X 7 -Mi Prolog SE®i3M-h'C'S!)l^i" 6 0

DP£fflt'£¥& v a >X • Jv/*>X*^<7)Salzberg bimfeXtm#)'? n^7<>x (DP) 2tu±, ili + VX >X#m) V\ DP 1= j: C0 ^ h6 m-c&ao DPi:z&###(D^-eii. O & 2 b 1: J: c"Cim&#jS^±lf & 2 a i:m%L"Cv^o

PAc =##(:<&5mefim&xo ^9 A-eii^v'^ iiu^wj:l-c, ie &T;i/XVXA^g^*#^L"CV'6^o # #%ea LT@g^j&- jM&my >;K;H:/f (^ y/^KSB^J-CT <
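To make the role of dynamic programming (DP) in assembling a gene model concrete, here is a minimal sketch that chains scored candidate exons into the best-scoring set of non-overlapping exons. It is an assumption-laden simplification, not the algorithm of any system named here: the tuple layout `(start, end, score)` and the function name are hypothetical, and reading-frame compatibility and splice-site constraints, which real systems enforce, are ignored.

```python
def best_exon_chain(exons):
    """Pick the highest-scoring chain of non-overlapping candidate exons.

    exons: list of (start, end, score) tuples with inclusive coordinates.
    Returns (total score, chain in genomic order).
    """
    ordered = sorted(exons, key=lambda e: e[1])  # process by end coordinate
    n = len(ordered)
    if n == 0:
        return 0.0, []
    best = [0.0] * n   # best[i]: best total score of a chain ending with exon i
    prev = [None] * n  # prev[i]: index of the preceding exon in that chain
    for i, (start, _end, score) in enumerate(ordered):
        best[i] = score
        for j in range(i):
            # exon j may precede exon i only if it ends before exon i starts
            if ordered[j][1] < start and best[j] + score > best[i]:
                best[i] = best[j] + score
                prev[i] = j

    i = max(range(n), key=lambda k: best[k])
    total = best[i]
    chain = []
    while i is not None:
        chain.append(ordered[i])
        i = prev[i]
    chain.reverse()
    return total, chain
```

With candidates `[(0, 10, 5.0), (5, 20, 6.0), (12, 30, 4.5), (25, 40, 3.0)]` the sketch selects the first and third exons for a total of 9.5, although the second exon has the highest individual score.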

References
1) Dong, S. and Searls, D. B.: Gene structure prediction by linguistic methods. Genomics, Vol. 23, pp. 540-551 (1994).
2) Guigo, R., Knudsen, S., Drake, N., and Smith, T.: Prediction of Gene Structure. Journal of Molecular Biology, No. 226, pp. 141-157 (1992).
3) GeneID: E-mail server for prediction of gene structure, geneid@darwin.bu.edu.
4) [author and title illegible in source], Vol. 37, No. 10, pp. 935-940 (1996).
5) Pereira, F. C. N. and Warren, D. H. D.: Definite clause grammars for language analysis. Artif. Intell., Vol. 13, pp. 231-278 (1980).
6) Salzberg, S., Chen, X., Henderson, J., and Fasman, K.: Finding genes in DNA using decision trees and dynamic programming. ISMB-96, pp. 201-210 (1996).

1-1-3 E#<7) $ >A? H =i - KWiTOJJ y X x A £ <7)M (1) ?y/

- CftG<7)

^^^^#7x7^(1^1)^^311^660 c ftG(7)7X7A^n#C(±3- F#^(7)?#/fUT^<, yat-^#%^-E-(7)#(7)##gg @(:oV'T6?-#LT< ft6#@BSr#ATt'6 6(7)666o &&, C^l6(7)1:#y%T A<7)m&cm^cTi±*a&Am%y%TAm#m^ (##t%:aamAH^mT-z*# #t&^) (7) r^#ft7;i/^0XA#My%TA^)^%t:^i-67/f-yifUf/f%^T'f #^#J (?#8

GeneFinder GeneFinder (Baylor College of Medicine) (± H tz 6 ## £ 6 o /i 7° cr X" 7 A coS t 0 ^G^6o t(7)*<7)FEXIi, &##K(7)^D(7)t'<'0^(7)#%[:^(t&tVd'v-(7) MTxyy/fxggK ^^(:ZD#Gft^#%(7) asm (%3 7) em^fjgij^#<7)emTim^L, 7 7, 7 &o FGENEH & t"C(±, ZtlbCD^? 7 'sCDcpfrb, A# (7)X37^#AC^&J:^77(7)m^-drem*t&o #1&, ^ta7—9 — T #/&/<-7 a 7 FEXHB, FGENEHB Sr ftp site TAM VX v>6»

GenelD

GenelD (Boston University, USA)I±> X ~f y 4 X Sf£<7) X 3 7 in — K 7° t 7 7 -v jv

(a-L§SrX37fbt*6(7)) <7)f (7)##T<7) ^SrE-a-tir'C^SJ

■f&o ^(7)a #(:, A--t7 h n7Srfflv»So E^ESTr-^-x b<7)*^n 7-+h

Table 1-2

System                      Ref.          Address (URL)
GeneFinder (FGENEH, etc.)   *1            http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html
GeneID                      *2, *3        http://www.imim.es/GeneIdentification/Geneid/geneid_input.html
GeneMark                    *4, *5        http://amber.biology.gatech.edu/~genemark/
GeneModeler                 *6            -
GeneParser                  *7, *8        http://beagle.colorado.edu/~eesnyder/GeneParser.html
GenViewer                   *9, *10       http://www.itba.mi.cnr.it/webgene/
Genie                       *11           http://www-hgc.lbl.gov/projects/genie.html
GenLang                     *12           http://cbil.humgen.upenn.edu/~sdong/genlang_home.html
GRAIL                       *13, *14, *15 http://avalon.epm.ornl.gov/
OC1                         *16           http://www.cs.jhu.edu/salzberg/announce-oc1.html
Sorfind                     *17           ftp://iubio.bio.indiana.edu/molbio/ibmpc
PROCRUSTES                  *18           http://www-hto.usc.edu/software/procrustes/
MZEF                        *19           http://clio.cshl.org/genefinder/

*1 Solovyev V.V., Salamov A.A., et al.: Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames, Nucleic Acids Res., Vol. 22 No. 24, 5156-5163 (1994)
*2 Brunak S., Engelbrecht J., et al.: Prediction of human mRNA donor and acceptor sites from the DNA sequence, J. Mol. Biol., 220, 49-65 (1991)
*3 Guigo R., Knudsen S., et al.: Prediction of gene structure, J. Mol. Biol., 226, 141-157 (1992)
*4 Borodovsky M. and McIninch J.D.: GeneMark: Parallel gene recognition for both DNA strands, Computers & Chemistry, 17, 123-133 (1993)
*5 Borodovsky M., McIninch J.D., et al.: Detection of new genes in a bacterial genome using Markov models for three gene classes, Nucleic Acids Res., Vol. 23 No. 17, 3554-3562 (1995)
*6 Fields C.A., Soderlund C.A.: gm: a practical tool for automating DNA sequence analysis, Comput. Applic. Biosci., 6, 263-270 (1990)
*7 Snyder E.E., Stormo G.D.: Identification of coding regions in genomic DNA sequences: an application of dynamic programming and neural networks, Nucleic Acids Res., 21, 607-613 (1993)
*8 Snyder E.E., Stormo G.D.: Identification of protein coding regions in genomic DNA, J. Mol. Biol., 248, 1-18 (1995)
*9 Milanesi L., Kolchanov N., et al.: Guide to human genome computing, chapter 8. Sequence functional inference, 249-312, Academic Press Limited (1993)
*10 Milanesi L., Kolchanov N.A., et al.: Genviewer: A computing tool for protein-coding regions prediction in nucleotide sequences, In Proceedings of the Second International Conference on Bioinformatics, Supercomputing and Complex Genome Analysis, 573-588 (1993)
*11 Kulp D., Haussler D., et al.: A generalized hidden Markov model for the recognition of human genes in DNA, In Intelligent Systems for Molecular Biology, 134-142 (1996)
*12 Dong S., Searls D.B.: Gene structure prediction by linguistic methods, Genomics, 23, 540-551 (1994)
*13 Uberbacher E.C., Mural R.J.: Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach, Proc. Natl. Acad. Sci. USA, 88, 11261-11265 (1991)
*14 Uberbacher E.C., Xu Y., et al.: Discovering and understanding genes in human DNA sequence using GRAIL, Methods Enzymol., 266, 259-280 (1996)
*15 Xu Y., Mural R.J., et al.: Constructing gene models from accurately-predicted exons: An application of dynamic programming, Comput. Applic. Biosci., (1994)
*16 Salzberg S.: Locating protein coding regions in human DNA using a decision tree algorithm, J. Comput. Biol., 3, 473-485 (1995)
*17 Hutchinson G.B., Hayden M.R.: SORFIND: a computer program that predicts exons in vertebrate genomic DNA, In Proc. 2nd Int. Conf. on Bioinformatics, Supercomputing and Complex Genome Analysis, 513-520 (1994)
*18 Gelfand M.S., Mironov A.A., et al.: Gene recognition via spliced sequence alignment, Proc. Natl. Acad. Sci. USA, 93, 9061-9066 (1996)
*19 Zhang M.Q.: Identification of protein coding regions in the human genome based on quadratic discriminant analysis, (in press)

— 45 — Z 9 2 GeneParser j: () Av^Ag#L"C^&o L^'L, y — f-iK ##%D 6^T^V'o

GeneMark v 3-yTI^-eil6§tifc GeneMark(iORF t 5th-v v>T^ -Bo Haemophilus influenzae, Methanococcus jannaschii, Mycoplasma genitalium ta $ ii xij $ £ t f A 4i M t> h 0 .

GeneModeler GeneModeler (Ceter for Advanced Computing in Molecular and Cellular Biology, USA) (i. C. ORF, %79/f %##

&X GC^I* diamino, dinucleotide<7)MSt& £'% X rr tfbL*C^.

GeneParser GeneParser (Colorado University, USA)(±#}ili<7)iBn"fb{- —^ —7 ■; 1-7- f&fHV'TV'&o

GeneParser 2"C It, GC#"#BUl:m*cf(f &&X.& j: 9 C GeneParser 3(iC

GenViewer GenViewer t±> X79 f Xg&{£?)7 3 T, 3-K#r > v 7 :txr;x?) fHifcffV', i^v>x V

Genie Genie (UCSC/LBNL, USA){±—flS'ftKtftvfl' 3 7;t(Generalized Hidden

-46- Markov Model (GHMM)) ^ nrr-WHMMtli, —0(7)/- Lri'U uWGHMMt'ii—owy - > b n ><7)ii^ii#SLTv^0 y- KR9

-cv'&o y/ff

1996^6^1:ISMB96"C%m%, f&y%TA(%)^c»hux.Z9o

GeneLang

GeneLang (Pensilvania University, USA) (J ilfci1 ^ tOV £ v > b t$> & jCi# co jU/£ U il alX, Prolog, of i), ;V-;v<7)S’eSEL^o C sntoul±, (Definite Clause Grammar) t LTBSi&SfiTV'So i£P^;K7)X =r rtf1—5Mc-e*ioTi't\ *#U$ftT-E-ft<^)#f@#CJ:^y>(±3-K^T>yf X3Ti:iW( U/io>iiX7'7-f x§Hi©X3T5js±t:^otv^o

GRAIL GRAIL (OakRidge National Laboratory,USA)(i> K#MP#J >

&o 2tt“^-7 7yL^#muBu-c0^61 i-&Z9U#B$-drTt'5o

OC1 Oblique classifier (OC1, Johns Hopkins University, USA)T(±, i^/ET^T-ii; V'TV'So #fi<7)ESt#l®C) X n T U^-3 ^3- ##3- K#%?&&d'&mB!lT&o

— 47 — Sorfind SORFINDli. 60bpm±(±^fi^GU-AG^tcORF$"*c»^. %794xgB{&(7)X 3 7, 0RFO3-K#T>yt;l/, ^T(7)%3

l±, #X37(D$&#i|g^$:L"C 1^ 77(7)3.9 9

PROCRUSTES

PROCRUSTES(UCSC, USA)1±£ < # W t) ft & v ' %m KM LX, oft),

L^v>S^iIiJ-e § & vv nmM

MZEF

MZEF(Michael Zhang’s Exon Finder, Cold Spring Harbor, USA)(±4IF4^ 9 9 9 &

#m-ei8bp (79lf K79X(±9bp), ##"C999bp (79

k: K 7 y X U 2,000bp)(7)3. 9 9 9## &#& L, 9 #<7)##(7 9 tf F 7 9 X l± 10 #)e@ -3-C-eW^^ V 9f #$^1/2 Z «)±# v^h^9 9 9T6& 2:

(2) af# r(l) 9 9/<9*3-F##?#|9XTAj C^V'-cm^L^^miyXTAiim^L'C

IfOfSSiM ?z h h o ^'o y 9 A El^iJ 6 gene structure & i^ill t h y X 9 A f± t' (D< 4'#zb&t'9%TA&m%i-3±Tmgg-e*&o

$%, 7079 ^^#^, j: 0 6nT#'C&&o

Cft3'f#Lw yXfA /5 X t

t L"t(±, R. Guigo

-48- Xff-f #5#J (?^8^ 9^) Guigo&kL cm±A-3Tv^####<7)

‘complete gene ’Bfi^J (Itn^ K 6#-% 3 K > $ T<7)i"^lV-CAcTV^6t60) ^^V'T#yXTA(7)/<7t-'7>%^#g^'CV'6o F^v# Tti, 2 fi 11±^ < ItM<7)T— 9 (£$5®f±Homo Sapience Oh) vt W$H&S £F'B;H t:o TV' &o mTT(i. r^#ft7;>XVXA%MyXTA(T)m%(:Mi-&7/f-ykrVT/f%7 #*a#&/<-xcLT. #y%T (mi-3#m<7)2a)o 1-3

                     Exon               Nucleotide Level             Exon Level
Tool          Loci   actual  predict    Sn    Sp    ACC   AC    CC     Sn    Sp    ME    WE
GRAIL2        188    571     529        0.65  0.87  0.88  0.59  0.62   0.42  0.46  0.19  0.12
GRAIL2(gap)   188    571     513        0.54  0.66  0.81  0.46  0.50   0.49  0.63  0.24  0.14
GenLang       185    568     685        0.47  0.65  0.77  0.32  0.35   0.36  0.30  0.35  0.37
FEXHB         188    571     762        0.71  0.78  0.88  0.63  0.63   0.47  0.35  0.20  0.37
FGENEHB       188    571     492        0.57  0.84  0.85  0.53  0.55   0.36  0.42  0.35  0.14
FEXH          188    571     650        0.73  0.77  0.85  0.64  0.65   0.55  0.48  0.15  0.26
FGENEH        188    571     557        0.63  0.78  0.84  0.54  0.57   0.54  0.55  0.19  0.16
HEXON         188    571     486        0.46  0.74  0.77  0.36  0.42   0.42  0.49  0.29  0.17
GeneID        188    571     645        0.60  0.73  0.85  0.51  0.53   0.43  0.38  0.29  0.32
GeneParser2   188    571     624        0.61  0.71  0.83  0.54  0.54   0.36  0.33  0.23  0.29
GeneParser3   186    582     588        0.76  0.82  0.90  0.71  0.71   0.51  0.49  0.15  0.19

M^>± $-7cc#?miy XT AT##

v^f^. U-tirJT xr; t>V' ttzi-tV ^Tfi^V'h?#] tv' < 7 7T(i^^c^^ (&!##: TrueNgative) . zx. 7 7 > £^{gij Ltz d v* < 'D'ft-fitzfc (H&R§ft:FalsePositive). £. tz (##%:FalseNegative) (±^^7 7%) !E#T&&o F#%J h F#&&J hv^2c%#^L^&(D$LT^ z.bti& 0 FSSJ (Sensitivity) Sn(±> ^l##e ({SB14 + SH14) T#Jo£ffiT|Hi:

49- &o (Specificity) Sp(±. (#1##+ &!##)

(co, mmac (Aco. Average

Conditional Probability (ACP)> Approximate Correlation (AC)& b'frfo&o C co#i -1 < CC < 1, -L v ACC(im*l%^;i/'eipr/<-t> HEL

#(±, CC E miS^O.Ol frb0.03mmirtiZ>M\n\lz$>Z>o mi-3
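The nucleotide-level measures discussed here can be written out explicitly. The sketch below assumes the definitions commonly used in gene-finding evaluations (Burset and Guigo): Sn = TP/(TP+FN), Sp = TP/(TP+FP), CC as the correlation coefficient of the 2x2 table, ACP as the mean of four conditional probabilities, and AC = (ACP - 0.5) x 2. The function name is hypothetical, and for brevity the code assumes that none of the denominators is zero.

```python
import math


def nucleotide_level_measures(tp, fp, fn, tn):
    """Sn, Sp, CC, ACP and AC from per-nucleotide counts.

    tp: coding bases predicted coding      fp: non-coding bases predicted coding
    fn: coding bases predicted non-coding  tn: non-coding bases predicted non-coding
    """
    sn = tp / (tp + fn)  # fraction of real coding bases that were found
    sp = tp / (tp + fp)  # fraction of predicted coding bases that are real
    cc = (tp * tn - fn * fp) / math.sqrt(
        (tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    acp = 0.25 * (tp / (tp + fn) + tp / (tp + fp)
                  + tn / (tn + fp) + tn / (tn + fn))
    ac = (acp - 0.5) * 2  # rescaled so that, like CC, it lies between -1 and 1
    return {"Sn": sn, "Sp": sp, "CC": cc, "ACP": acp, "AC": ac}
```

Note that Sp in this convention is what other fields call precision; it is not TN/(TN+FP).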

x.6fi&o VXD") ^M-a-^Missed Exon (MEX

^BJ L£:x X y >

mi-3(D Texon-levelJ

Sn ##67f-L^#1^

Sp : n^lxtzz-? V xdi Xu

me : SE<7)j:y v >§I^MtiMMT-§&7)'o;E£<7)<7)S!l'n- WE : y<7)y
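The exon-level columns of the table can be computed along the following lines. This is a hedged sketch under the usual conventions: an exon counts as correct only when both boundaries match exactly, a Missed Exon (ME) is an actual exon overlapping no predicted exon, and a Wrong Exon (WE) is a predicted exon overlapping no actual exon. The function name and the `(start, end)` tuple representation are assumptions.

```python
def exon_level_measures(actual, predicted):
    """Exon-level Sn, Sp, ME and WE for lists of (start, end) exons (inclusive)."""
    actual_set, predicted_set = set(actual), set(predicted)

    def overlaps(exon, others):
        # True if exon shares at least one base with any exon in others
        return any(exon[0] <= o[1] and o[0] <= exon[1] for o in others)

    exact = len(actual_set & predicted_set)  # both boundaries correct
    missed = sum(1 for e in actual_set if not overlaps(e, predicted_set))
    wrong = sum(1 for e in predicted_set if not overlaps(e, actual_set))
    return {
        "Sn": exact / len(actual_set),
        "Sp": exact / len(predicted_set),
        "ME": missed / len(actual_set),
        "WE": wrong / len(predicted_set),
    }
```

A predicted exon that overlaps a real one but misses a boundary lowers Sn and Sp without counting as missed or wrong, which is why ME and WE are reported separately.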

mmttMt Utz^myX-TA

a 1-2 {Z&'i.tz^m '> X t A

^tz o M GeneMark (##&#@^F.coZ^(D^^),

GeneModeler (MC.elegans 0)tz£>)„ Sorfind (^ XcD/cM'^tzfjj{$L4'

fr'ifz), GenViewer 0C1 (xr;'/H>lo >©llE L

Ef-^t:it£>LO'^o) Coi'ttirX lLti^iV>0 £ tz, MZEF,

PROCRUSTES

& jo > GeneFinder (FGENEH) cry-^-f IS "C#&Z

0 tC*fS$•££:/<- y a X -C% & FEXHB, FGENEHB i:ov< M Lfc„ GRAIL2,

GRAIL2(gap)i:: o v> T tiy < — -/ a >(X-grail, verl.Sc^'Effl Ltz0 GRAIL2(gap)tix

-50- fpflfi 4*Hl<7)SllST't±iE 1-3 <7) AC J: o t- GeneParserS >5*1: ov>x\ FEXH^ GRAIL2 i)%i. v>0 GeneParserS /5$ftSf§::^ £ v^ti, n y — o t. < t CATv^^»()T&69o jfe^FGENEHB, FEXHB, GenelD li^r<7)t> l) (Zft, hi*) #1##^ L < exon level T O If liffi X:' (± > exon-intron h %(±Sn, Sp(D#^*62:FEXHT66o ME#^^ FEXH, GeneParserS Tab&o i^i! L tz exon T# & WE Wfrb GRAIL2, FGENEHT*^^. WTx

omvxDmmsmm 1m&3-?yy t£ GenelD H* X < f&ffi T § £ 0 f Mi * V > t± GRAIL(overlap ^ < tt) h\ FGENEH(exact 0

GRAIL GRAIL(gap)^ J: v^0 &Z<7)%m&XZ>it SfeA&o GRAILZi)tGRAIL(gap)m(%9^j:<^&o ##

1-1-4 mmcekaim^: (1) cDNA-fishing cDNA(±^±^y/ADNA^6^>/<^#^3- PLTV'6#m^l)m$^7t:6(?) "CAD, OIW'fiTV'&o %^TcDNA(±. $f ^ ff 9 # A C # 6 & V'#$f anm #$f ## DNA Xtfrb, cDNA«i^ft^^o->tt7 0'7 V-tLXti%y > ?&\Z-?r\Z's- L^L. yVAl:3-F$fLT v' 6 AT #m & AT (7) cDNA & %# L T##f t & ^ h (±#m ^ C ^ T (±

- 51 - 4-2 mmemmhL-cyyA DNA*'5v4±cDNA^^a-->^$fi#$f$^6ca^#^c^o L^'L.

M(:-e^)%^#A-CvX h6#*@dmmi:imL"UaA:T?)cDNA/<

y 3f ;i/y ^yy ADNA4)^6#m%m#i#

2(7) j: 9 &$Kjfctf>$*X\ yy A DNA <7)MBd?iJM;y4- x. b titzn \z * <7)M^.^%M(7)#^ $w:#m

cDNA-fishingh(i. M^^cDcDNACMf&mW^x.^^#::, f<7)^e#/2 f Z9&cDNA&#mM(:%#i-62&&#;t6fi, 77-ycDNA9^y9V j: 0% %$fi^y L^y a-->y^/\/f y v f/f 4f- y a y^rMv^y a-><7)aigiJ(:Z&y n-z:yy h#x. 6fi &o L^u. ±EcEaL^m^T(±, %& y >;7 7y

^<7)cDNA^%#LT%#%##emA'5h^9#%(OmW*m^$fL. CCTD^-C^

yy^-^y < y l±V'x.f. 21 &#{&Lt±?. z c is % fit ;m & (7) f-emm-e # & & „ m±<7)± 9 I-cDNA-fishingt±, 9 >^y®*3- Y'-$ &^#(7)cDNA

-#mc#PCR&m'"c

-52- ##C##f6o 2fi6(D4]'C^m^cDNA'r&&2 2:^^ cDNA(7)5'^Z^3'%e^ n —— > y-f h RACE & (Rapid Amplification of cDNA Ends) <£>—® "Cab & h

Figure 1-20 outlines the amplification of cDNA by PCR.

Figure 1-20 Amplification of cDNA using adaptors (steps shown: mRNA (polyA RNA), 1st strand synthesis, 2nd strand synthesis, cDNA, adaptor ligation, PCR, full-length cDNA)

z-f. muh S VMS& ^ mRNA(*° V AttiO RNA)£i6ffiL. t')=r( dT)79^ '7-&m'TcDNA&-&#L, y-va

*\m-tz> 0 c^cDNA(Dm^#e#ma c-c. mpsem v -(gspi &£td±GSP2)i TV?97 4 v-(APl)£fflv>TPCR£lT p - tlzX 0, TSEfclfiJ

—(GSP1)cDNA<%3’ffl/$C ±E* IrJv *tz 7 y A v-(GSP2)

emv'tiecacDNA^ 9 C GSP1 t GSP2 ^nefi±Ei3 Z rrFEKftfi-r & £ 7 K LX 5’ fc 3’-RACE # £ 9 (cEltL-Cti < ChCZ^T. 3’* £ V5-RACE «M^n f ? hem-a'LT7--;p$iir^^ $^i:^#<7)#SRS^iB^ TV'fr-frhvwmKmsmz d tiL^o> Gsp ^-e # & ^ ^ i: 7 - - K Shuttle PCR. Touch down PCR2) &£vd± Step-Down PCR3)

-53- ##(08 V' PCR 9 6 a o (2) hybridization % f i/f f- xmmmMc2a%we#muy. a DNA & a V4i RNA e#m& a a ^ * I), < #BU t^63KM$frCv\&a#T6ao #6ftm%^6^)(±y^7/\ 4 7" V V 4 -tf— 7 a 7 (Southern Hybridization) "7% V > DNABrlt h n

-tr;vn-xE^ hi:$5%:U FylSS$ *l£#£<7)SSE?iJ DNA Brlt^yo-yh Ly/^yv y/f-tf-y a 7$-±^#, t- b9^ty77/f V'T#mi-a^&"e6ao 7V y^f-tf-y 3 7^#(:(±#/(r^/

V y t -t: - y 3 7 & ifvHtMy & a 0 t fcSJBffl b L*Ct±, 7 4 7 if-?V 7 r 4 7 7\

T7 7 7 7 7 y -v )V/\4 7"V y 4 -tf— y a 7 4\ SBH(Sequice by Hybridization) 5’6\ FISH(Fluorescence in situ Hybridization)^ 7 < v>/:? n — 7 BISKS i: <£ a#m&^z 6hi:z c y ,

jjHb LTt±> tf^-f-7 - 7 tfy 7co|g^- J^DIG(Digoxygenine)i:'? ‘<7)i^#^:fflV'7t:t) <7)&av4iFiTC#(om^#m^mv^6%^m'e&ao $6s. la^yii^myvx *y$&*?$k%tofa$L*mwLx>'4 yv yt-tf-y a 7em###Kc#mtach 6M#(:^c"C#-CV'ao z\/ yv y^f-tf-y a 7(±m±(7)j:7(:*'^(:m^»%%T6a^

/f yv y/f-e-y a 7Bm

1) Chenchik, A., Moqadam, F. and Siebert, P. (1995) CLONTECHniques 9, 9-12.
2) Don, R.H., Cox, P.T., Wainwright, B.J., Baker, K. and Mattick, J.S. (1996) Nucl. Acids Res. 24, 4008.
3) Hecker, K.H. and Roux, K.H. (1996) BioTechniques 20, 478-485.
4) St.John, T.P. and Davis, R.W. (1979) Cell 16, 443-452.
5) Drmanac, R., Labat, I., Brukner, I. and Crkvenjakov, R. (1989) Genomics 4, 114.
6) Drmanac, R., Labat, I. and Crkvenjakov, R. (1991) J. Biomol. Struct. Dyn. 8, 1085.
7) Scholler, P., Karger, A.E., Meier-Ewert, S., Lehrach, H., Delius, H. and Hoheisel, J.D. (1995) Nucl. Acids Res. 23, 3842-3849.
8) Regalado, A. (1996) Technology Strategies 24-30.

1-1-5 "1-1-2 (4) tr* ####, #CBftv(Hidden Markov Model: D.THMM)<^f#fi14

XAO&A

HMM N'it, Viterbi 7 )V n" 'j XA u t Uf l£fi& WjffJStl@i(X (Dynamic Programming: DlTDP)(7)-aM!t7^rf0 XA t HMMKZ&YJ

ifti )i &Viterbi7)Vrf V XA<7)pViterbi7XV XA? HMM $ 7 1 Xi hmm k j: o rry mi^hmm^ ? 17-7 (NeuralNetwork:mTNN)^€'^i:^&o

- 55 - ^ Ltx Kulp biziz,-mt HMM(Generalized Hidden Markov

Model: W'L. GHMMT(i#^CNN^h'

TV'&o ^MyyAK^J^±#(:#

&m&Viterbi7;i/=fVXA&m%f HMM \ZZZ> rs^rnm m

;i/^VXA(Dm%^'g"C6&o Viterbi7;i/=fVXATI±,

?/f yc

7 zf v #$%(:$#%#&m#fSViterbi7^^V XA $#m#e^$i-&Viterbi7^^UXA|±, yyA^Cm#LX

mnmcw%-c&&ck;Mm#$ft&o mma^^im^c^Krogh^

Krogh^^fm-Cli.

HMM * y b 7-y <7M7;v:7V XA<7)H!#§ HMM L/tyy A^E7';V(7)#SXd±, HMM $ v b 7-7<7)£tii7;vrf v XA<7 HMMtlt Baum-Welch 7 ;v 7 >j XA XA btvX^h l)0 Baum-Welch 7 )l rf V X A \imit&(D-UX$) »K * y b 7 - 9 b *° Oy-ii r7 W* (7)#^#^ z.bii& t XtlX-? iZftLX j - ? (DfmPJf >& biLSo ##i" 6 2: ^ Baum-Welch 7 )V rf >J X A f'lA $ v b 7 — 7 b -fn v - i / * 7 y - 7 <7& & o HMM £ tBffl L fc XV A -t r (7)##r (±N ¥ / A E^iJ-lfiaii HMM <7* -7 b 7 - 7 b *° n v- £/\7 y - 9 i: X o T^m^tiTv

So io-c, yy AK^m#^mmtsHMM&m#tfst:(i,

'IffKi 0 2<7Z 7 HMM * v b 7-y <7 b *°n 7'-£ ¥? *-?

HMM b 7 - ? & A

#"7^(7)#%e#^#^$fL"Cv^s6,7,8,9)^ Takami6<7)#&2:

-56- Krogh Gii. left-to-right

HMM $ v b 7 — ? £$"$■ 6660 Cl

st. cfL6(b#&"C(S. Fujiwara Ct Yada £> <£>#££ !±. —#%tHMM $ v St. C4iG<7)#&TW:. yvAgm^c^u

&t)6^7") - i/ 3 >4^#^4 7-9 —y 3 >4^gg &d*T#&V'o f; X A £ m%-f

References
1) Levinson, S. E., Rabiner, L. R. and Sondhi, M. M.: An Introduction to the Application of the Theory of Probabilistic Function of a Markov Process to Automatic Speech Recognition, Bell Syst. Tech. J., Vol. 62, pp. 1035-1074 (1983).
2) Krogh, A., Mian, I. S. and Haussler, D.: A Hidden Markov Model that Finds Genes in E. coli DNA, Nucleic Acids Res., Vol. 22, pp. 4768-4778 (1994).
3) Yada, T. and Hirosawa, M.: Gene Recognition in Cyanobacterium Genomic Sequence Data Using the Hidden Markov Model, Proc. of the 4th Int. Conf. on Intelligent Systems for Molecular Biology, Menlo Park, Calif.: AAAI Press, pp. 252-260 (1996).
4) Kulp, D., Haussler, D., Reese, M. G. and Eeckman, F. H.: A generalized hidden Markov model for the recognition of human genes in DNA, Proc. of the 4th Int. Conf. on Intelligent Systems for Molecular Biology, Menlo Park, Calif.: AAAI Press, pp. 134-142 (1996).
5) Lewin, B.: Genes, Oxford University Press, 5th edition (1994).
6) Takami, J. and Sagayama, S.: Automatic Generation of the Hidden Markov Network by Successive State Splitting, Proc. of ICASSP, pp. 2-5-13 (1991).
7) Krogh, A., Brown, M., Mian, I. S., Sjölander, K. and Haussler, D.: Hidden Markov Models in Computational Biology, Applications to Protein Modeling, J. Mol. Biol., Vol. 235, pp. 1501-1531 (1994).
8) Fujiwara, Y., Asogawa, M. and Konagaya, A.: Stochastic Motif Extraction Using Hidden Markov Model, Proc. of the 3rd Int. Conf. on Intelligent System for Molecular Biology, pp. 121-129 (1994).
9) [author and title illegible in source], Vol. 37, pp. [illegible]-1129 (1996).
10) Asai, K., Yada, T. and Ito, K.: Finding Genes by Hidden Markov Models with a Protein Motif Dictionary, Proc. of Genome Informatics Workshop VII, pp. 88-97 (1996).
11) Gribskov, M. and Veretnik, S.: Identification of Sequence Patterns with Profile Analysis, Methods in Enzymology, Vol. 266, Academic Press, pp. 198-211 (1996).
12) Bairoch, A.: PROSITE: A Dictionary of Sites and Patterns in Protein, Nucleic Acids Res., Vol. 20, pp. 2013-2018 (1992).

1-2

S# £ W ^ o C> mRNA ti 1 9 ® £ 3 - K LtA'S (f / h V - v

(CAP#1&) &ZCf3'*%<7)#200bp?)7f-;i'm(:Z&##

mM%<7)AUG3K

-58- (SDK^!l, %#^#mRNA(7)^VA A^#mLTtU^(dT)@^il:*9 A^ &'&## L"C mRNA U ^V-A RNA ^ k'f)#?)RNA)0' 6^#fa C hM#"C& 0 . yy e%#L-CcDNAe-e-Rlci-&^^(:#m$fi-C^ &o AWtL^^ca^6im#(D%#^Mv^2Z:(±^»r#T6&^ 1 #mmRNA#fC#%

?ii RNA ^ v y 9 - -tf c z ioobp@&(o^» v -kmc&^amgm -E-cc^mimaf^^LTRNA^v ^9--tfZ:m:#%$)av4±m#%[:

RNA^V y 9--fe*!i^§&f£<7)r RNA * '-M 9- DNA t>^&Zllt'<

fma-a-fa c z: o , &#&#& z mmAgmv'f %##-&(: t a#^mv^acaM#BT&ao

2. Gmmmwmm ■ mmmm 2-1 'A&i-mLo-fn 7 < swmmmmi af (omaamreoem^M^^mc^L-c^^m^z: z^^z 9 %6^ci'acz:^^i;MX'r6ao L^L,

c %m#"c* a m wm

- 59 — Bti&o fee. BodyMapfi. cDNAt’/h7^y a >S> x.y/\y-q-— by y 'f'&tc £"#y ^tHt± Fluorescent Differential Display (FDD)> Molecular Indexing. cDNA Microarray &#B#A"e(±#y ^7)fig#

v - <%## h #m<7) y a 7 f - ;i/ & o at.

71? % p a ; jv x £ M#rt h ~7 n r t - A ffltr^Mfl& F^I X <£> f fi b

-e^mm^at&o at. G-et<. #Lv^f h o 2-1 -1 Fluorescent Differential Display (FDD)

Differential display analysis iiti. ;i/_h"C7 -i >if -YV > bi-^act<#Lt^i&^hLT1992^[:LiaiigaPardee fiTM0. #<(D#%^'Ca4

Wk mRNA@C-7^T(D#m^ Bl 2-1 li DDm(0*m%f & m-e$)6o GenHunter Kit (Brookline, MA) j.y — Xhh bM!,fc>ti&o DD&"Cli#^i"&7°7 ^ffcj^fca I) ?> y b

7°7 A nx b £#x.ta'o’K§ t>ibXjiMX'£>&0 tt'L. tat-It 7 m-A-ii# @ emmt & t ^ta & ^ c^m-e^ &„

a t. targeted differential display t£3,4)

Figure 2-1 Principle of the DD method (panel II: PCR with a 10-mer 5' primer, AGCCAGCGAA, paired with an oligo(dT) anchored primer; panel III: bands from mRNA samples A, B and C compared on a gel)

Z I), ffl WE<7)ifJn£0&d $lx.t±\ FGF : fibroblast growth factor) -1 C Z&yff mSEM<7)-tOTic*S£-US14ffc$ tiS d

V7XNIH 3T3$toici5^T> FGF-KC^o-CSSSn^Se^^DDritc j: I) £ (Fnk: FGF inducible kinase) odd "Cffl ^'Tz7°y 4 — ^DrG + f- -fcf K

DNAy-^y-9 —dd-eii, itM'fyjy-b L"C20-mer##(O^<7)&Mtx 7>*-7°7^f v-£ LTBamTV:

CCCGGATCCT15V, MluTV: CGTACGCGT15V (V=A, C or G)£J9ata £0 ce?pcR^mno^^/f yyf^ fz, Z 0 DD#m&l:#t':58;a LT, 10-mer 'f v-&J: 0*7 > *-7*7 'f — % GT15VN (V=A, C, GO< 7H, N=A, C, G or T)£ffli'& d b IZ X *),
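The degenerate anchored primers named above expand into a small, enumerable set. The sketch below lists them under the assumption that "T15" denotes a run of fifteen T residues, that V stands for A, C or G, and that N stands for any of the four bases; the function name and its parameters are hypothetical.

```python
from itertools import product


def anchored_primers(prefix, t_run=15, v_bases="ACG", n_bases=("",)):
    """Expand prefix + T(t_run) + V [+ N] into explicit primer sequences.

    n_bases defaults to a single empty string, i.e. no trailing N position.
    """
    return [prefix + "T" * t_run + v + n for v, n in product(v_bases, n_bases)]


# GT15VN: 3 choices for V times 4 choices for N gives 12 primers
gt15vn = anchored_primers("G", n_bases="ACGT")

# CCCGGATCCT15V: 3 primers, one per V base
bam_t15v = anchored_primers("CCCGGATCC")
```

Each anchored primer selects the subset of mRNAs whose last one or two bases before the poly(A) tail match, which is how the DD method divides the mRNA population into manageable fractions.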

W4#6fi&a#d k"C6&o

— 61 — V'9^"C##(7)#m^$)&o DD A ii <7) M14 tr O v > -C i±, -fv 4 -?- Uang6##S#(:i&h80~90%'r6iK 100% "C (i*v' 0 |5j—69$B8B£fflv\ RNA^^>(7)S^^^X X- 1 LTlro t 3 BO® j; L tS DD mmsmcZ & t /< > K^)Rm#(i^

Figure 2-2 DD band patterns labeled with (A) 35S-dATP and (B) 32P-dCTP (lanes 1-10)

Figure 2-3 Molecular indexing (labels shown: ds-cDNA, ligation of the 64 adaptors with E. coli DNA ligase, Dynabeads-streptavidin, amplification by PCR, automatic recording with a 373A sequencer)

7vh-yv>T^>y±-c(i, f&#&<%/<>K

nm^v'/La, at. y;i/&6#omLt/<>K(±±m?)#a\ ALT^o. -a. ?n-z:>y&iT.p'c&6mm@dm*m:m'acw(maLv'7'8)o

2-1 -2 Molecular Indexing &#m±. v'f

-62- O&UNRKH:. iggS^CKatoCi^T

##<7)#%^j:^2:V'Aa. C<7)^(±im2-3(:^i-j:o c. 3mm^ciassiis#m##, 64@m

&^

tch^T#&o m±

^^ K&##L. OKfJj:

2-1 -3 cDNA Microarray 7 0 DNA chip %DNA 7 — X > -> > 9"#w#L^c2:^6. Lxmm^f ivt v>a0 if V V t )V — T*>H _lf > 9 9 7 71- £> a Affymetrixttii^bS^SSX'ISS 5 ii

"T v-> a 7 t b V V X7 7 — £ffl v>T '> V zr > ■f- ^ 7°t~ oligonucleotide £ HI%VC L tz

-63- DNA7n-7£ft®U HIVMffl

02-4 A - b V 7 y(GeneChip)k 02-5 -f-b'jt^&mMm head Z> 0 chip ^ dichroic morror It, photomultiplier tube (PMT)T## $ ft & o 2h.67)#qq(±i"C(zl@:fz4P#Kr^

m&f(D^#e50%#mw#aLTv^o ttz, BRCAi(Dmmmmc&L -cj3b. < b^

> K V 7 DNAchip ####& t>Sffl$ ft i. 7 b LTV'&o * "j y a.^ — Affymetrix

#(:#%<V'&c ^ 7>M £ cDNA KSffl L tzbU PKAv>mti3fcf6C2:(:j:i). YV7 L/c Microarray b 7>%B&£ 0 2-6 tc75-f"19) c 9 mm JMM

[Figure 2-6 residue. Recoverable labels: printhead; baseplate; microscope slides.]

Microscope slide El 2-6 Microarray ###) n v b 17° 0 > b ^ y K baseplate 35 J; 0*' .2 it £• x,y,z # tcB#14"SEttS> Wla — 9 — frhtzha v 4 ^ D y P — b PCR##^r printing tips poly-L-lysine n — b L X 94 KX9X(:iK;5nlfcx^v bf So b Kjir?x4Mt sa##to-c&So #(:/'4 XV y4 4f-y a >"06S^. Schena^(±2&<7)m^%M$-mv''C&#^e oft'l)o L T t± fluorescein t lissamine 2 mRNA -eft-e^m^Rc^L^dCTPe^fSo dfubox n-X^M^microarrayC/^4 XO X4 XL. t5o 1046Hot bJftLcDNA fcHI%L4tLtzmicroarray37ti43t*e ##LWurkatmm##K&. Lfv>* 0 C.<7)#(i. cDNA microarrayX — >^rbl:#XS±'C#&PX##"C6So L d'L. A^XA6#A6fiSo
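The two-colour comparison described above (two mRNA samples labelled with fluorescein and lissamine and hybridized to the same microarray) is commonly summarized as a per-spot log ratio. This is a minimal sketch under that assumption; the function name, the intensity floor and the example values are hypothetical, not data from the report.

```python
import math


def log2_ratios(sample, reference, floor=1.0):
    """Per-gene log2(sample/reference) for spots present in both channels.

    `floor` (an assumed parameter) avoids taking the log of zero for empty spots.
    """
    return {
        gene: math.log2(max(sample[gene], floor) / max(reference[gene], floor))
        for gene in sample
        if gene in reference
    }


# Hypothetical intensities for two labelled samples on one array.
channel_a = {"geneA": 400.0, "geneB": 100.0, "geneC": 0.0}
channel_b = {"geneA": 100.0, "geneB": 100.0, "geneC": 8.0}
ratios = log2_ratios(channel_a, channel_b)
print(ratios)
```

A positive value indicates higher signal in the first sample; a value near zero indicates no difference between the two channels.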

S#-o (0(± cDNA & X 4= ? b Ltz microarray /$*j£ll(c ^S. /\4XUy4 4f-y3>C6-12#R^3#i'S^^'e&So 4^#. CfuboHmd# >5'^'. cDNA array 6 p p x."C<7)y-p 4 > b ’Cab■So —tfr. 7fe(0xE^M$fffl O micro array ^ t Ch-C&So m±

-65- Ctz i> co # x. T ^'-S t b "Cafe-So C (Dt^'uiz & „ cDNA microarray

%t)-E-9"C66o AffyrnetrWimf y7##mmi20 < %#(7)7f h 07/974 -R#T(±-f v/±C0.3 < 9 & L< (i0.2 <

RT#h^c"Cj3i), 4"# $ ^ (=$ < <7)oligonucleotide&

9o

References
1) Liang, P., et al., Science, 1992. 257: p. 967-971
2) Liang, P., et al., Cancer Res., 1992. 52: p. 6966-6968
3) Donohue, P. J., et al., J. Biol. Chem., 1994. 269 (11): p. 8604-8609
4) Donohue, P. J., et al., J. Biol. Chem., 1995. 270 (17): p. 10351-10357
5) Miniati, D. N., et al., Cancer Lett., 1996. 104: p. 137-144
6) Nakagama, H., et al., [journal title illegible], 1996. 41 (5): p. 580-584
7) [illegible] p. 196-200
8) [illegible] p. 114-123
9) Kato, K., Nucleic Acids Res., 1995. 23 (18): p. 3685-3690
10) Kato, K., Nucleic Acids Res., 1996. 24 (2): p. 394-395
11) Strezoska, Z. et al., Proc. Natl. Acad. Sci. USA, 1991, 88: p. 10089-10093
12) Maskos, U. et al., Nucleic Acids Res., 1993, 21: p. 2267-2268
13) Drmanac, R. et al., Science, 1993, 260: p. 1649-1652
14) Kozal, M. J., et al., Nature Medicine, 1996. 2: p. 753-759
15) Kreiner, T., American Laboratory, 1996. March
16) Schena, M. et al., Science, 1995, 270: p. 467-470
17) Schena, M., BioEssays, 1996, 18: p. 427-431
18) Schena, M. et al., Proc. Natl. Acad. Sci. USA, 1996, 93: p. 10614-10619
19) Shalon, D., et al., Genome Res., 1996, 6: p. 639-645

2-1 -4 45^074-nmvt

-66- -^mwfum^&Tyn yoff-Ah(±AT(7)%mma^)-m^mf^o 7aTt-Ae#mcL -c <7)ga mk#c z a 7 /f - ##m±#a& i% /< ^ T(D##cjN- L -c, @ yaft-A^#

W'L. mm^#me#mL^mamT

(1) WHWmrn#

a. sa*-^5ctt»ac»as mam%^7cm%#(mmi±0'FarreUt: j: 0

y;i/±c%^'y h^L-c#

tf b tl%>o

1. CI'yyA2)iMX#3xlO''bp, 5f#@m^maR^#mL'c^3^#x.6fL&<7)'e, %

BM&f&#S\ c<7)@m

pH unit, sDsmmmcimL

"CiilkDa^mme^LTv^o O'Farrell<7)m#"e[if7.cofte##(:LT, 500##

-67- O'FarreU^mt&'C (± ampholyte iz£& pH EMC & v > TSSMit <7) pH Cim#^&6 2 tfrb, ±&S emet'9 LTNEPHGEiiKnon- equilibrated pH gradient electrophoresis)^'tvtV'& 6)0 &tifES.iEfSi#■$• 6K

& ^ ^ L. ^pH (D##Aa ^ B&mm c^mi-& chczb#?#^#'r^menr#cL^6(7)T*6o CL,

—4f> S?lt4tc:ELTtiiE&t:'tipH4J@S<7)B$Lt-i3V'>"Campholyte(Pharmacia)60n ^mA$r#L-cv^o t LX lPG-IEF(immobilized pH gradients)^H5E $ fiX v> h 7,8) 0 li#^& pK# <7) il £ ~M~$ h acrylamideGmmobiline, Pharmacia) 60 iSUi:£j @£ £ fE M L > acrylamide 60viniyl^^^LTpolyacrylamide #%CL^ [51/E ikf^caczi), AimL^mi)<7)^L^pH^Ke#&6^"c&6o ^cof - ?/<-BMl:$ft"Cv^6 8,10)^

##(:. %^7cW/c7 < Sfifx4 7 h60#m BPB#(OA^(:Z6%A.

$ frc # -c ^ WR & Wrt h tz&lzii^ 32PSBiiis £ Ef western blotting&1: Z 6 membrane

—•^TC'/n 7 j -;h<7)SSW#i-ov^r(±> illflof-l'®digitalfflftfc Lt©

## "T # (= f 6*: 1: Z & #m C^l L TliCCD(charge- coupled device)array scanner 18S L /t laser densitometer 4<

%####(:# L T ii photostimulable storage phosphor imaging plates 4-y Effl'4 6 imaging densitometer(Molecular Dynamics 400A Phosphorlmager, Fuji BAS series, Bio-Rad Molecular Imaging System)/5s H^StvCv^o t — h 9 z/'t 7"? 7 4

i^S'T'n 7 ^ £ n > e° j. — ^ ti J: ±SB<©>@HtW# , m#TV7 hV^Ttbr, PDQuest, Kelper,

-68- ELSIE, MELANIE, GELLAB, Bioimage, HERMES-;v

b. maRx^v &ds##T&V'C2:'C6-3 2:o R###r&[: j:&A7f-K<7)T< <%iKL^^o AC^T#&o

1) a#f#mmaM(>100kDa)C'3V'T&#f#(oa&#&&nr#I:L2:':hc 2) 4tffiB®X>*$S&[n]J:(fmole PA;i/)0

CLiL^>(7)^Mi± laser desorption(matrix-assisted laser desorption/ionization, MALDI)i£14) &>%> V'{±electrospray ionizationv£15) & i> Z> o Rs^h #&(: j: Z Cf#&7 < y m@g^iJ(7)mm(±maR7-^A -x<7)#^(: j: 0%^7C#^#(#(7)#aRx^^ 7-^A-x##(7)^##y 7 1 B
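The identification of proteins from 2-D gel spots by mass spectrometry (MALDI or electrospray ionization) followed by a database search can be illustrated with a toy peptide-mass comparison. This is only a sketch under stated assumptions: trypsin-like cleavage after K or R (not before P), standard monoisotopic residue masses that are not taken from the report, and hypothetical function names and example sequence.

```python
# Monoisotopic residue masses in Da (standard values, not from the report).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, seq, tol=0.5):
    """Number of observed masses within `tol` Da of a theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


peptides = tryptic_digest("AGKRPLKMR")       # hypothetical sequence
hits = count_matches([274.16, 500.0], "AGKRPLKMR")
print(peptides, hits)
```

In a real search the match count would be computed against every database entry and the best-scoring entries reported as candidates.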

c. -;AtcW$6Kx-7a-x ##. m^m^^^^Tcy^gaRy-^A-x^m^t) tiX^'% (#2-l)o K 7 A tiX&*K main index(SWISS-PROT)5:#a461 ir -BHiiT — ^ a —x^x<7) hypertext S^i<7)7 7 -k X /$$W ti"C v>£0 Z.lz ExPASy(Expert Protein Analysis System, URL;http:// expasy.hcuge.ch/)^ sd51 $tlXi'Zo SWISS-2DPAGEL%##<%7 -X

(2) 4#?)m@ %^7c y ;^g a R#^ c it# u -cm#&m###;%-cz

$i^L§ titz S. ceremsmeX>Mi-titt 1:

Table 2-1 Two-dimensional electrophoresis (2-DE) databases [caption partly illegible]

Database | URL | Institution | Samples | Status | Remarks
ExPASy | http://expasy.hcuge.ch/ | Geneva University Hospital, University of Geneva | Links to other databases | federated* | tools, software [rest illegible]
SWISS-2DPAGE | http://expasy.hcuge.ch/ch2d-top.html | University of Geneva | Links to other databases | federated | [illegible]
HSC-2DPAGE | http://www.harefield.nthames.nhs.uk/nhli/protein/ | Harefield Hospital (UK) | Human heart, endothelial cells, rat heart, dog heart | federated |
HEART-2DPAGE | http://www.chemie.fu-berlin.de/user/pleiss/dhzb.html | German Heart Institute Berlin | Human heart (ventricle, atrium) | federated | chamber-specific, disease (dilated cardiomyopathy)-specific proteins
HP-2DPAGE | http://www.mdc-berlin.de/~emu/heart/ | Max Delbrück Center | Human heart (ventricle) | federated | protein [illegible] service
PDD | http://www-lmmb.ncifcrf.gov/PDD/ | NIMH, NCI | Human body fluids in disease states | federated | [illegible]
CSH QUEST | http://siva.cshl.org/ | Cold Spring Harbor Laboratory | Yeast, rat fibroblast cell lines, mouse embryos | partially federated | 2-DE based protein database [rest illegible]
MitoDat | http://www-lmmb.ncifcrf.gov:80/mitoDat/ | LMMB, NCI | mitochondria | partially federated | mitochondria [rest illegible]
PMG Argonne | http://www.anl.gov/CMB/PMG/ | Argonne National Laboratory | Mouse liver, human breast cell, Pyrococcus furiosus | partially federated | [illegible]
Cambridge 2D PAGE | http://sunspot.bioc.cam.ac.uk/ | University of Cambridge | Links to other databases | partially federated | entry point
YEAST 2D-PAGE | http://yeast-2dpage.gmn.gu.se/ | University of Goeteborg (Sweden) | a number of species of yeast, S. cerevisiae focused | partially federated | S. pombe, C. albicans, R. glutinis
YPD | http://www.proteome.com/YPDhome.html | Proteome, Inc. | S. cerevisiae | partially federated | yeast proteins [illegible] database
YPM | http://www.ibgc.u-bordeaux2.fr/YPM/ | Institut de Biochimie et Génétique Cellulaires (France) | S. cerevisiae | partially federated | 322 spots [illegible]
2-DE maps | http://www.LSBC.COM/2dmaps/patterns.htm | Large Scale Biology Corporation | rat liver, mouse liver, human liver, plasma, corn, wheat, rabbit psoas muscle | partially federated | IEF/NEPHGE 2-D pattern
Maize Genome Database | http://moulcm.moulon.inra.fr/imgd/ | Institut National de la Recherche Agronomique (France) | maize | partially federated | relational database on maize [partly illegible]
2D-PAGE Protein Database | http://tyr.cmb.ki.se/ | Karolinska Institute | Drosophila melanogaster: major body parts | partially federated | Drosophila [illegible] database
Biobase 2-D PAGE Database | http://biobase.dk/cgi-bin/celis | University of Aarhus (Denmark) | Human keratinocytes, transitional cell carcinomas, bladder squamous carcinoma, urine, MRC-5 fibroblast, mouse kidney | partially federated | human skin, cancer cell [illegible] 2-DE database
ECO2DBASE | http://pcsf.brcf.med.umich.edu/eco2dbase/ | University of Michigan Medical Center | E. coli | partially federated | E. coli proteins [illegible]

* ExPASy [illegible] 2-DE [illegible] hypertext cross-referenced [remainder of footnote illegible]

References

1) Kahn, P. :Science, 270, 369-370 (1995)
2) Wilkins, M.R., Sanchez, J.C., Gooley, A.A., Appel, R.D., Humphery-Smith, I., Hochstrasser, D.F. and Williams, K.L. :Biotechnol. Genet. Eng. Rev., 13, 19-50 (1996)
3) Wilkins, M.R., Sanchez, J.C., Williams, K.L. and Hochstrasser, D.F. :Electrophoresis, 17, 830-838 (1996)
4) O'Farrell, P.H. :J. Biol. Chem., 250, 4007-4021 (1975)
5) Dunn, M.J. and Corbett, J.M. :Methods Enzymol., 271, 177-203 (1996)
6) O'Farrell, P.Z., Goodman, H.M. and O'Farrell, P.H. :Cell, 12, 1133-1142 (1977)
7) Bjellqvist, B., Ek, K., Righetti, P.G., Gianazza, E., Goerg, A., Westermeier, R. and Postel, W. :J. Biochem. Biophys. Methods, 6, 317-339 (1982)
8) Goerg, A., Postel, W., Guenther, S. :Electrophoresis, 9, 531-546 (1988)
9) Bjellqvist, B., Pasquali, Ch., Ravier, F., Sanchez, J.-Ch. and Hochstrasser, D.F. :Electrophoresis, 14, 1357-1365 (1993)
10) SWISS-2DPAGE Homepage (http://expasy.hcuge.ch/ch2d-top.html) Technical information on 2-D PAGE
11) Miller, M.J. "Advances in Electrophoresis" Vol.3, p. 181, VCH (1989)
12) Dunn, M.J. "Microcomputers in Biochemistry: A Practical Approach" p. 215, IRL Press (1992)
13) Gillece-Castro, B.L. and Stults, J.T. :Methods Enzymol., 271, 427-448 (1996)
14) Hillenkamp, F. and Karas, M. :Anal. Chem., 60, 2299-2301 (1988)
15) Fenn, J.B., Mann, M., Meng, C.K., Wong, S.F. and Whitehouse, C.M. :Science, 246, 64-71 (1989)
16) Billeci, T.M. and Stults, J.T. :Anal. Chem., 65, 1709-1716 (1993)
17) Henzel, W.J., Billeci, T.M., Stults, J.T., Wong, S.C., Grimley, C. and Watanabe, C. :Proc. Nat. Acad. Sci. USA, 90, 5011-5015 (1993)

2-2 mm 2-2-1 /f v y a t1 (i) y/Amimt'Tvkryy 3 f ;i/f n-:: yv

<£ 9 (w, v 7 \£y xfrbb&t 0 ? n — —>?\ v — >y&b'v'> < o

DNA_hM3&Bftl (v-*-) v-#-F4«jSfSMf&;i^£ & t Kv-

ZOcM&m 6 DN A 8d£iJ 2) 20 ~ 30 M^MgdmWiJUS 9 IS LTiS& -BSO ISLE^cbJev^w &cDN\ X

S» <5(b###(c^6o SE(w7 7 ^XcTiWeissenbuch^r'P-'ll'

&<#b:K&o

^mmf5f^7vy"e#&cai:»3o c^z^cL-r, i-ei:5000&v##^yyA ±(:-7 vy$b,"Cv^o 9^< "?

-72- (cM), DNA(7)

(2) V v h: yytcEvTti, A DNA £ * n- >itt Z> z h /M&#

"YAC (yeast artificial , Bf^AX&fe#) /$'fi]ffl$ii& = YAC& 200 35 ####a O E A DNA & ? n - Mkf & c & #"C # & „ %(: Cohen (pX C L A y ^

-y^, A2d'-c(i&&d;%hy/A^#(:#-3TYAc?)#?ij?a-->mim (#mma) CKDlfegl^fflV'-r^lf LAv^M<7)YAC y n —

Ch^-C#&o ACYAC^o-yDNA^BAC, 3%< K^Pl77-y^^^#rL^ f V'±#m<0/

y/ADNA<7)^a-z:yy[:igEv'-C, EADNAm^t:##X

i) amomRNA

($A(icDNA) t<^)/^yvy^4f-S/3>, 2) (3&(iHTF) 3) J:^v>i'9v^>ym, 4) Ufa'?

[zoo] ynvT/f>y^^^*&&o

-cm# (ma) ^

(3)

LhyyA^)^ y-y 2(7)#MC#^y D-z:>ya y-y J:>y >

3 y;vy n--> b&foti&o X

m#DNAXy y;i/^ 6g#C##af5f e|W|& 2 t

[Figure 2-7 (caption illegible). Timeline with columns 1997, 2003 and 2010. Recoverable labels: DNA samples; YAC contig; BAC; SSCP; "no cloning, no mapping, no sequencing"; "no cloning, no sequencing".]

2-2-2 yy A(7)####ctT;i/^#(D$ijmiiXM^v>o n hen c, mmc, &o r%J &3A#"C&&d^

t''Mffl\EifrbilL 1g&MM<0't'tAstzi8.B.%§.bLXCD\'fc>! t£ F#J <7) ##myy A##rnmma&l%@d'&#{t#mm%ii::&yy A Bd^Jtv'-)^tt5j*x.«*v»#jniiiifi£§ 6Kim x.oo#> a £>#■?, c & -e, -E-^#m#(±3E(:i#Lo"36&av'x.6o W-(r(i, (##, ##), mm, ya

m±#re (±#cy/ A cfrx.a c a a o i- 6, ##KimA"C, MW^kWiMb Genetic Footprinting,

-74- ? 7'mm # ## b DNA 7 -y 7s1- «t h Genetic Footprinting & t'

mat(± Brenner LT(±#&y/Ay-? Jiyamf f A2:^#"C&&o C2"C(±-e?)#A"C&6i&%

Lj:9ki-&#&&&%0±tf2:o

y a 7 v a 7/^'etiitfS^, iStlfii D , jx --?&

(7)#m^h LTy/f & b 9 >xy? y 3 ^mm^^^^Z^fmrnodulatorem^# 2f6%V ±tf&o v7xt±t: Hi&ca&ia^odt^&k M-#fcfi®Wi3a7c*e§s^*e*So "C t± y y A 7 f K <7) gene trapping ± if-So £. tz^T^'^'J 7frbW-^M f'^7 7 &1: ##C#y #^#^T(Dyy A

$- #m L ^ mfxf m#K t: :mf &####?)%%& m#s L %. L-cma%») ±tf & c a c z ^ -c^ A

X##^$jML^L. (±W A#&t'%%9

mbx^coffl @&#&f h iim&Mltzx&sxLxm$*%x^<

mmc, cfuiv'Nf F%J c

-75- / AKfJ k 9 v>#)n##&)nx.'oc$,& h 9 *x. A6-e#

6fi&o *#"C(±. (mm. ##). #m. va^yaO^J:. 7tX

(i)

A,a-ct. z^i? mx.^,Ac & ^ m^A-Cf-CC. /f>7;i/JL>4f#m. '7/f3Y7X'7. ^1##^^%### m. ^mm[:mxT%#A#-e^&m^m<7)^yyAg^j^^$fi'c^i). $6c# M^cfu:mc9kL"r^&. < x%yyAii$,6#gE(7)y- ? J: > X@B* ^ t c T(i. I). zw<&,M(=&9'^6&o #

669 ^.

^^m-e L-rx. ^ ^ „ #&##&?<%#### a&a&#efm#h 9 LTiGMmEfrCV'&o #CAy/ Aga^ij^

ra&5EB#tmm^eurofan9 ftisi#^aiitiiiif^tt*i:i^^ *-

zzr(Dmf%fm#&(:ML"C[i. PCRe$ij mLitaf-?/( V b^#MLTw&c a->{k

-76- pcRame-eossaem: h? >x?f-

L"C##(:CMlt£;fi& = fc£LC<7)^i£-m7t5fc<7)gieEM<7>v

— t ##^@^2 t bfiXis D x Kanr ^ S.cerevisiae

##*(D C M L "Cli EUROFAN T (±#4r & -e<7)^ W ^ 96 ^ "X I/- b"C<7)%#l:j:cTXXU-->X%[:^fcTV'& jz-pT^&o ##&#c%#%m?)#|i#:d:EUROFAN&<%"?, $ & & LWo) u^uw^x x v --

6(O(i30%@Jg2:d'&0%V'Z9T&&o CCT^MEA^,

6&2&?&59o < . c ft 6<7)#^c t $-T #^

^ & C 2: U Z C Tamm^T & ^ 6 (f, X l± C &&<%-?&& XtmmL, -eoEEto modulator t LX0$m

?y 3 V-X$r 9o

& & w±ma mm% c##-e # s y x f A & mm####-# a u-c^t s 2 a ^ 4"#(±#3g(:^^'C< &T&%9„ #C##l:"3V'T(i##m7,Ni'?--'FDNA&&T,' >^SAGE&t: comparative cDNA sequencing 6&Wi message display & U69Sfr L v ^ 7 7° n — -f-

-77- iWii^oSS$c>M^:+ m $ frc ja i). ####?<%#%/< ? - >& a z 9 ^ci6$fi&2:V'9o b . Genetic Footprinting

(ri^#L/vfi"#Lv^^h L"C #;&# K fi @ $ fix v > h

2fi2:TyM#y7/f'7-'?PCR&fTVx Cri,^ DNA'y-^^>^--e^kS-i^o eiJx.ii'200bp<7)/Vy Kt±#Htoy 7 -f v-tf'£>^

200bp#m:Ty;M#AL2:?n-:/&$L, -E-(riBRW#%^^i-

A(f4-##2: #G?xmc# AL^TyA^(ri/<> Klif c(^b 2:%irc T7 v b^V > bj X #2:&b?)afK?m'&'c&i'LW\ xagf[:%mt&7 7 byv>bzb<)$6(:±m

tix^&o %t£h\zgf £fflvT6£#n 5¥k&®- t'{&$ky^fr6K&C2:^#^(ri(:#L"C. #6(riGF"e(±-iB-#mf5f &LG*c^L ^ g,»^AT66 9o CLTairW(:#B(<7)#6?md'^ GF^

-78- Z9(:L"C.

eRi^-e# &c<7)j;9 < >ynT#&%gs<7)j:v'79^'7-&7''iM
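The readout of Genetic Footprinting mentioned above (PCR with a gene-specific primer and a transposon primer, where each insertion yields a band whose length reflects its distance from the gene-specific primer) can be sketched as a comparison of band sets before and after selection. This is an illustrative sketch only; the coordinates, function names and the interpretation rule are assumptions, not taken from the report.

```python
def pcr_band_sizes(insertion_sites, primer_site):
    """Band length = distance (bp) between the gene-specific primer and each insertion."""
    return sorted({abs(site - primer_site) for site in insertion_sites})


def depleted_bands(before_sites, after_sites, primer_site):
    """Bands present before selection but missing afterwards.

    Assumption: a missing band means cells carrying that insertion were lost
    during growth, i.e. the disrupted region matters under that condition.
    """
    before = set(pcr_band_sizes(before_sites, primer_site))
    after = set(pcr_band_sizes(after_sites, primer_site))
    return sorted(before - after)


# Hypothetical insertion coordinates around a primer placed at position 1000.
bands_before = pcr_band_sizes([1200, 1450, 900], 1000)
lost = depleted_bands([1200, 1450, 900], [900], 1000)
print(bands_before, lost)
```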

C . /7i’^~/<‘~ 3 “ 7s A >/ ±BEO 2 olr^^^-b-tir/c X i &T 7n —7 ^molecular bar-coding (MBC) "C&&o fej- IZ3.~ — ?t£9 'f'ttsh 20bp Ot1) 3"?—fcj$ALfc3 vy y 3 's&ftMi'&o -3 * (I ####CA%^DNA2: xecdj-&t)(fC63o ^CCfLgr GF trati n y 3 >£jWt, *tf>iiiM&-e£DNA Zim-TZ. cmDNA (:#ALzt:7 f V ^ 6-;^:f V d-f-yyc/^ 7V l) f V 7-f y 7(:

y-C6&o tv=fyy7a#<%###(7)imm6m&a[±v';L, c ^)y%TA^#^f

7 yn-yea&o t t±\ MBC(±nTE-e&^>o ztitz(Dtnt? ffl«f y 7ti^^-e § £ 69 £Mt)X~&2>o d.

cfie?#f67ayyA(±psoRT^a'^#$^T(iv'6^ 9 it±v^ t*c%^v^o C *'CV'&(0(iYaieA#^y^-7'e*&o %^^lacZ^GFP^^(9l%^-^-miSf e A(:##yyA(:#AL^9i77U-^^m.

-79- ^ L-c^A

c fi a $ fLT V' & „ (2) #a MC. elegans 1 L T Brenner—# (2 J; ^ X £

mia-ciayny^AmmE. *-XA(D#%c#A^ %"C&6C2:em#iEL^:#t j:V'ei|T(±^^69^o m&oim#z s,^cf &±-c±# & #.&?&&k#x.6fL&0 #^m(7)/J'm(:Z6m&cDNA(7)#$f'CI±, a obitmaomigim 96^7f-"7v H:j:& ###in situ hybridization £ fr o ^TiS^H^LT & ij tig £ti&0 ttz GFPStntScS

tz%■eSS't"5*) > 2f(0 j: 9 < (Dd'#BW&& t C6"C*&o ^ ^mat? i± b ? > % ^ y y i: 1 "T#T & & o i- ce c m miiBa'ciiaac w%$m$fLTV'&2 hd'6w»& z 9 c. #s

'AmicmngL tz-e& «k B»&if£fflvTifx.sroi/^-et± ^< -c, 9o (3)

-80- b x-ii"^ 9 t — v X y t ^ v y 1/

Lv'#am,

Benzer L a #$f C & & C h (±# 9 ^ "C t ^ $ 61: B

^&"C6%9o > b(: b 7 v y&^6'%^(Di-Cfi^##f^e#ATV'&(7)6#^T6&o f (QKm-eyy A(D## (±-5-t7)x 9 — b ^'iEibifcStiS^&vtfx USBerkeley <7) y';v —tp'll'i-i 9 #

/<%? v --yyf^Tya-f-emv'Ty 3 ^ y a ^

^Rubin 6I±. 2ft^(D/

#mrn&modifyfaafsf&%?v->f&77u-f-&;#a6LTv>ao A#ibLj:9 kL-C^f). 9^ ^fi^x7-;vr v'f-fhz. ttf£z.tTvSM%o ^### #(:L6 Rubin 6(D7ya-^CL5/^miz^^L f * & 6 9 o (4)

7^x©tfM|iLto#felit b(:#6,&v'&v^9L^^cTjatC

####2:c,(7)K&#/f > H:&So k 3^-^y n —— > Vco^ktz LX § fclxS!H±®£>‘t‘:S§v^/K ^CT'livo v> 9 * y 9 fffo-otz o %t>tbX informative ^,##6& W±#E^:M# ^ft^A&'o (i Z 9

#^c(±L($■ lcrnfR#wc.

-81- L t t'X. & V'#di@"CU\ t ^ 3(7)^# t»&T&69o mu fC-C(i#m&#yy3f 0 o ## 2 ^7cS^$ktt5rfflv>/z RLGS-Spot-Bombing t£(± ^ & y-;i/ $r##f & 1 # G Reelermfz?^$#^^"C'E-<7)#%%^$fLTV'6o ^^"7

^%m^6D/\AX ?-&^jfifSM-t&&V'j: 9 ^##CL^r;v7591 i!5tz &v> m-e-e t ^ cmaaGf & a-cf cft-c 1'5= IW9&jgL/tv-*-£M C##f 4JSiLtli PCR RDA &o YAC. BAC^ir±m^a->^v^4:^#$ft"C V'&ws-s-f)&

$^-7^xhk KC(±#V't#V';l, a^fCZc'Cli'v^x&k b h<0*v^(omAg &S$E£iJtn h Yto"Cab 5 0 o $£:transgenic^knock-out(0#-o "C6##abh v>t±B#S#Ht£ta if

nooab i) > f c "C ti § fc>#>TMab £ v ii x t - -y #B14<0 ^ v 7° y/A##a #mT&"Czb6oy/^E^Jy-^ h(OV j: b, 2(OZo^7n^-^-&|@i%, 4#2ft6m##

"CL^^O hV'0

Lexicon (Oi 0 ab ibbtvcv&o

82- RACE Bret 9 7

t|6]^x.&y%TA^c»< %2:-M^?&&&LV'c C(7)Z9^#fEf%##^3T%yS/

(5) tT;i/^#i$m

r-tT^vj ^6C(±. rn b

2: c5#y/ Af-?fi<7)^%@ra (:-!-<7)y /

>?-$? bgr^WLT zkt)^ct,6(7)yy

ffL&g^TMZ: ^ jt% "T# ^ ^ ^ t, C ^ t * 6 tf ^a ^ 6

^fiJfflL j: 9 t-f %>$.*%&b1rz> k, CLti^'li&t <0 1-

0$;MgV'o

-f C-C^r#6DC t &o*£TO<£>7* —* £v> < L/t j; 9 &r-5"< -%##6fu& Z3 o #ijx.(f XREFdb(±@##<7)Amfsf 2: k b EST t

2:*"CV'&, y a-z:>ye$#$fi

^ j: 9 bmfKf 2:^<2, Z 9

##2:% bEST$riKU'#^^7'-^^:-%t##$fiTV'6o Clfi(±y 3^y 3^/

^##2:k b^m2:eii$y#it&nr##e#*()^6^'e*i), @*-c#i$gev'6?)T&

M?#&Z9&^XTA$fioo* 6. &#&%(:##%?)#V'&m-f#(:#&<%>

2: (±.

j: 9 C^^2:, ^2f##(9^K)^f%v^

$W:fo#MtW3f?&5 9o

-83- z cmmwfmna® & 2 b v'o # -f* tz£>#####->xr A<7)@#i$(i6camm$^'c^&nr#'e&6 9o

3. 3-1 3-1-1 uwvmm&omm (1) ^$ij##%(07a-^>y

a. n- - > y cDNA<7)%#t: j: 2(OcDNA^probel:L"C#^® 6 y 7 A DNA & 9 ! 7"9 V — 6 plaque hybridization ® C 4 D %#L, Si^FSTOiEl^J £B£®6 2b/$lEMMf;b;h-Tv>;E>1~4)0 ^eiJWE?'J

£ $5¥ H© <0 IS] CR- <0*5^- DNA@B?|J%&o lllfgli L*C(Dl± -m vitro SoklFin vivo <0 primer extension, SI mapping® 5® ®(i reporter assay ®7\ © ii DNA-S 6 ®Z#ffl iz o v > T (± DNase 1 footprinting ®8 \ DMS protection footprinting®® DMS interference footprinting ®% UV cross-linking ® 10\ gel shift®1® S ti W HUffiS#r U o v> f i± chemical cross-linking® 1® 3&#Ett%® affinity tag ®x® two hybrid system ®15) tvtv^o $/"c|5 ^B©<7)^ n —— > 71: (1 Southwestern blotting® 16\ far-Western®17) ijffl $ ft Tt'&o

mRNA (0 5'%C##® & cap ##*fUffl L T^6 cDNA £HE® 6 ##<0^®^#

V -WE$tm^o 2(091 7'9 V -

-84- ^ b AT v y#a

The main databases concerning transcription factors include (1) Transcription Factor Database (TFD, NIH, gopher://gopher.nih.gov:70/00/molbio/other/atfd), (2) Eukaryotic Promoter Database (EPD, EMBL, http://www.genome.ad.jp/htbin/www_bfind?epd), and (3) TRANSFAC Database (GBF, http://transfac.gbf-braunschweig.de/). In addition, local databases include (4) Muscle-Specific Regulation of Transcription; A Catalogue of Regulatory Elements (LANL, http://synapse.lanl.gov/muscle/HomePage.html). (1)

(iNIH<7)Ghoshi: j: / release 4.2(1992)(±1447@(DK^B#^## $fiTv^c ©t±EMBL HSSi$nrv>/B eukaryote <7)Xn7-X^x

-x-e^t), release 47.0(1996) i± BE 1,270S<7)7°n^-X -cblSS@dfiJ(762kb)75sSS ©liGBF(#)(:Z()m#$fi"C^/&

# ^ L/^$E#$!i#@Bflj/K®B#(:IS®6 relational model database "C&&0 release 2.6(1996)(i#k%:B#m%1601@, ^##KMm%43O3#^#m$ffCf3 0, l£^H#tcH8Lrt±SiS, ##, #S#M®&B##(A##,

l' V 7 ?x(169#), 7 < LtK^B#

is®®&Me^B#*si=r@e?ij> sm^ttsu#

b. 'sT>'y 3i-iv? n-->x & %> ms# £smt-xn-->x®&&

— 85 — (D a $- m# a L T$iimf &

Kinzler & VogelsteinCZ ()whole genome PCRii XlcfttSo DNAEtf-^fi^B^E^U X Cf protein A-Sepharose %fUffl LX5XM. L> DNA BrH" & HI JR-T^o DNA 81 It £ catch linker OgeJIKiJttSL^ primer LTPCR Ci I) iftS catch linker0C##f^ EcoRI site £ fijffl L T 81 It <7) W3I £ EcoRI "CM U ^X X-UX n-- >X'T;E>o DNAKfM-^^$iJ##%2: CfLeya-yh LT#M

Xenopus transcription factor IIIA22) id X Shuman retinoic acid receptor a 23) £Ik

X n —— > X'$ •etiS^:§fiTv>^v>0

Hb^BfZ: DNA##&#&#&- ho -t;vn-xx -f ;PX-#%(:Z t) [HOT5L£24) 0 JfcXn-^xx’iwO^T l=L 0a^$fi^:WK»#^Kr(7)DNAKrM-^/yLy

xxJm%F$%Ltzo estrogen receptorX>S69itf5^-<7) m#C%e$iJML-Cv^o estrogen receptor # DNA8 &±##C%% L T±#C#R#U 0.2-2kb<7)''MX<7)DNAI#rlt#5am ^#5 n^o X X'i- & C. b L X v 'b 25)c b ii«i DNA in vitro (iScfv>-cv>5 in vitro X>Jj&{z J; 1) & DNA811t^f% vivo tij3V->‘V$K®F

-86- &c in Gould6(±

kf XX (Ubx, homeotic gene product)^# t KtB £ -X> Ubx b DNA*fM-(D#^#:e% b V X b 7 U x > - 7 X tr - x tf - X £ f'J ffl L "C @ JR L tz o in vitro

DNA £ x 4 7*7 U - t5' tb L, Ubx <7> #P Sr £ U *C v> £ - b i in situ hybridization (O## U J: b SB L "C v > 4> 26) 0 M^I±±E^&(7)acAi&^mouseX)^"CM%L"C^&^o A"C66o 1) X n vf > paraformaldehyde U J; b5C1 b a b&^Bf ##L#< b## &2k;m'#(:&5o 2) DNA MM" in vitro^(Z)#t)b(:. DNAKUt^o-7^Xa-XhL"C plaque hybridization ’SrfiX'x 4 7* x V — b3 U X n — >/i$SlliC. h &SSi* 6

LfcC. to in vivoffia'b |b) D#H?tt£r in vitro Ino*{±iF$ 46 inwgroiie'a"em#tL^miR(i^m^^#^#x.^fi&o 3) iK^Bf^DNA# It £ 7” n — 7’ U L T $1# $ X 7 A DNAUfttEbcO exonSShOSStC cross-species hybridization XtiJffl LAX to exonSttiifif$9"CSS5ii& Cl h Xfijffi LA £> <0 -c^&o c*^i-&cDNAxn-7g)%#i±%m#g)@^m^f^#'e"mm

"Cab-S/SL X1 (± itfStFto |wl 5e^ n]" SB "C ab & c ’HsEffl L "tmouse #$IJ mfL-T l(2)gl b S v 'ffi IWItt £ #i" 6 msri-Z $ti*ti>&o ©1%^-x-mfxfbx>x^x>$-#mLx. Xa t - X Casadaban W: Z b XaJEMtD bacteriophage Mu 4>28) 0 Dang ^ tiTcJUBUti vt mini-Mu::ZacZ L "C b-galactosidase iltsf- £ StU ft] MtX)^$0f|]M60 $IJfSpT U £vAx77'x V Ii::©7^7 'J-^MLtil©|i:UJXg?H f- Hap2p(CCAAT-box binding protein) to #J#T to )! f5bF £ SS* L "t v' & 29) 0 Hap2p % mcj: b*^M#(i)^fkf&Xa-7^26#mL, f Hap2p#J#T(:6^Ck^m^"C&^^mef (iCYTl

-87- o tzo Bellen b Drosophila P-element enhancer detector b iSj#60 A?i i! L T v -£>3010 b-galactosidase l/^° — ^ —it Drosophila^ h 9 P- element tw#A L/c £<1 b !3 X 0 transformant IrfEMi" £ o b- galactosidaselBlSf j:y/',t:Z DI k ^SU Wagner-Bernholz 6 (i-hSBcO^tfe'CftMS tifzffi 550 fS<7) transformant strain ? V — — > 57"'t"62 h C j: 0 x homeotic gene Antp j: 0 ^ 6 strain Lx Antp tc X 0 negative &BU ^fs^fiJfPT^itLH1^ homeotic gene salm"C&-5> L.b ^MBMLfb 31)0 ##(7)^(tobacco, potato, ArabidopsisltcisivxT & b-glucuronidase £ l/*° — ? — Agrobacterium <7) T-DNA ^ — h A & Lh C t 0 cO'fPlA) promoter ^#mL^transformant^##$ft^\ %^#%(:#m$fi"CV'&^x

^

2) $K^af KC Z &### 6 fix ^ ^C (i y ^f ^ L

knockout $fi"C 6« Hh (3v7x tri> homeotic gene productx steroid receptor superfamily CMi" h knockout

M~t %>&Msb L T (±#1W L (± differential hybridization ?£35) ;5sSiB $ tiX^'tztih #H1 subtraction^^) (i%Am^%^i'TM#$fix PCR l#$M^#A$fiAL b t- X h suppression subtractive hybridization &37'

§l$|‘/n 7 'f — ;!/##%#L L’Clil) DNA sequencing ##L j: %> BodyMapvi 38) > SAGE(Serial Analysis of Gene Expression^ 39 \ 2) DNA Writ### L Z &

-88- Differential Display i£40,41)> Molecular Indexing v£42)> SAFE(Selectively Amplified restriction-Fragment Electropherogram)?i 43\ RLCS(Restriction Landmark cDNA Scanning)?^ 44) > 3)hybridization#$r 6 Sequence Fingerprinting High-density cDNA filter hybridization &46) > Microarrayed cDNA chip ’}£47)> High-density oligonucleotide array 48) . 4 ) 7'nft-^ throughput f-^^-%2:m#L^cDNAKr

Saccharomyces cerevisiae J U tiT jo D 49) > Is^BU v YPDy'-?'<-X(D*T=fV-#m[:

(148#) f #E#afaL"CM&$^^m8K(±156#. ^(7)^38

# (24%) 6,ooo#2:m&$fi-ca D. ^ 3oo#am#<)ft&<7)T, mm

mi-nknockoutm^wtmm^vy j x 0

1) Maniatis, T., Fritsch, E.F. and Sambrook, J. "Molecular Cloning. A Laboratory Manual" Cold Spring Harbor Laboratory Press (1982)
2) Glover, D.M., ed. "DNA Cloning - A Practical Approach" Vols. I & II, IRL Press (1985)
3) Higgins, S.J. and Parker, M.G. "Steroid Hormones - A Practical Approach" p. 99, IRL Press (1987)
4) Latchman, D.S. ed. "Transcription Factors - A Practical Approach" IRL Press (1993)
5) Williams, J.G. and Mason, P. "Nucleic Acid Hybridization - A Practical Approach" p. 139, IRL Press (1985)
6) Weaver, R.F. and Weissmann, C. :Nucleic Acids Res., 7, 1175-1193 (1979)
7) Gorman, C.M., Moffat, L.F. and Howard, B.H. :Mol. Cell. Biol., 2, 1044-1051 (1982)
8) Galas, D.J. and Schmitz, A. :Nucleic Acids Res., 5, 3157-3170 (1978)
9) Ogata, R.T. and Gilbert, W. :J. Mol. Biol., 132, 709-728 (1979)
10) Chodosh, L.A., Carthew, R.W. and Sharp, P.A. :Mol. Cell. Biol., 6, 4723-4733 (1986)
11) Garner, M.M. and Revzin, A. :Nucleic Acids Res., 9, 3047-3060 (1981)
12) Aranyi, P., Radanyi, C., Renoir, M., Devin, J. and Baulieu, E.-E. :Biochemistry, 27, 1330-1336 (1988)
13) Lee, W.S., Kao, C.C., Bryant, G.O., Liu, X. and Berk, A.J. :Cell, 67, 365-376 (1991)
14) Smith, D.B. and Johnson, K.S. :Gene, 67, 31-40 (1988)
15) Fields, S. and Sternglanz, R. :Trends Genet., 10, 286-292 (1994)
16) Singh, H., LeBowitz, J.H., Baldwin, A.S. and Sharp, P.A. :Cell, 52, 415-423 (1988)
17) McGregor, P.F., Abate, C. and Curran, T. :Oncogene, 5, 451-458 (1990)
18) Maruyama, K. and Sugano, S. :Gene, 138, 171-174 (1994)
19) Carninci, P., Hayashizaki, Y., Albertsen, C. and Schneider, C. :Abstract for 18th Meeting of Japan Molecular Biology Society p. 292 (1995)
20) Clontech Laboratories, Inc. :Clontechniques, 11, 2-4 (1996)
21) TGS :Annual Report of TGS (1995)
22) Kinzler, K.W. and Vogelstein, B. :Nucleic Acids Res., 17, 3645-3653 (1989)
23) Costa-Giomi, M.P., Gaub, M.P., Chambon, P. and Abarzua, P. :Nucleic Acids Res., 20, 3223-3232 (1992)
24) Inoue, S., Kondo, S., Hashimoto, M., Kondo, T. and Muramatsu, M. :Nucleic Acids Res., 19, 4091-4096 (1991)
25) Inoue, S., Orimo, T., Hosoi, T., Kondo, S., Toyoshima, H., Kondo, Ikegami, A., Ouchi, Y., Orimo, H. and Muramatsu, M. :Proc. Nat. Acad. Sci. USA, 90, 11117-11121 (1993)
26) Gould, A.P., Brookman, J.J., Strutt, D.I. and White, R.A.H. :Nature, 348, 308-312 (1990)
27) Tomotsune, D., Shoji, H., Wakamatsu, Y., Kondoh, H. and Takahashi, N. :Nature, 365, 69-72 (1993)
28) Casadaban, M.J. and Cohen, S.N. :Proc. Nat. Acad. Sci. USA, 76, 4530-4533 (1979)
29) Dang, V.-D., Valens, M., Bolotin-Fukuhara, M. and Daignan-Fornier, B. :Yeast, 10, 1273-1283 (1994)
30) Bellen, H.J., O'Kane, C.J., Wilson, C., Grossniklaus, Pearson, R.K. and Gehring, W.J. :Genes Dev., 3, 1288-1300 (1989)
31) Wagner-Bernholz, J.T., Wilson, C., Gibson, G., Schuh, R. and Gehring, W.J. :Genes Dev., 5, 2467-2480 (1991)
32) Fitzmaurice, W.P., Lehman, L.J., Nguyen, L.V., Thompson, W.F., Wernsman, E.A. and Conkling, M.A. :Plant Mol. Biol., 20, 177-198 (1992)
33) Lindsey, K., Wei, W., Clarke, M.C., McArdle, H.F., Rooke, L.M. and Topping, J.F. :Transgenic Res., 2, 33-47 (1993)
34) Hsu, C.-Y.J., Komm, B.S., Lyttle, C.R. and Frankel, F. :Endocrinology, 122, 631-639 (1988)
35) St. John, T.P. and Davis, R.W. :Cell, 16, 443-452 (1979)
36) Sargent, T.D. and Dawid, I.B. :Science, 222, 135-139 (1983)
37) Diatchenko, L., Lau, Y.F., Campbell, A.P., Chenchik, F., Mosqadam, F., Huang, B., Lukyanov, S., Lukyanov, K., Gurskaya, N., Sverdlov, E.D. and Siebert, P.D. :Proc. Nat. Acad. Sci. USA, 93, 6025-6030 (1996)
38) Okubo, K., Hori, H., Matoba, R., Niijima, T. and Matsubara, K. :DNA Sequence, 2, 137-144 (1991)
39) Velculescu, V.E., Zhang, L., Vogelstein, B. and Kinzler, K.W. :Science, 270, 484-487 (1995)
40) Liang, P. and Pardee, A.B. :Science, 257, 967-971 (1992)
41) Ito, T., Kito, K., Adati, N., Mitsui, Y., Hagiwara, H. and Sakaki, Y. :FEBS Lett., 351, 231-236 (1994)
42) Kato, K. :Nucleic Acids Res., 23, 3685-3690 (1995)
43) Suyama, A., Uematsu, C., Suzuki, Y., Kambara, H. and Sugano, S. :Abstract for International Workshop on Recent Advance in Genome Biology of Micro-organisms (Chiba) p.35 (1996)
44) Suzuki, H., Yaoi, T., Kawai, J., Kara, A., Kuwajima, G. and Watanabe, S. :Nucleic Acids Res., 24, 289-294 (1996)
45) Meier-Ewert, S., Mott, R. and Lehrach, H. :Abstract for CSH meeting of Genome Mapping and Sequencing p.211 (1995)
46) Zhao, N., Hashida, H., Takahashi, N., Misumi, Y. and Sakaki, Y. :Gene, 156, 207-213 (1995)
47) Schena, M., Shalon, D., Davis, R.W. and Brown, P.O. :Science, 270, 467-470 (1995)
48) Lockhart, D.J., Dong, H., Byrne, M.C., Follettie, M.T., Gallo, M.V., Chee, M.S., Mittmann, M., Wang, C., Kobayashi, M., Horton, H. and Brown, E.L. :Nature Biotech., 14, 1675-1680 (1996)
49) Oliver, S.G. :Nature, 379, 597-600 (1996)
50) Murakami, Y. :Abstract for International Workshop on Recent Advance in Genome Biology of Micro-organism (Chiba) p.44 (1996)
51) Davis, R.W. :Abstract for CHI's Symposium on Gene Functional Analysis (San Diego) (1996)
52) Smith, V., Chou, K.N., Lashkari, D., Botstein, D. and Brown, P.O. :Science, 274, 2069-2074 (1996)

3-1-2 a A LT## L.

100bp@aWTC. B

— 92 — mf&z tc DNA^-e-#^>/^#af^mmi-&DNA (i-mcmbp -+bpo , „ DNA^^ ^#:(7)^ $ &CBE&K7)DNA$g'&#? <7) < . tb a.y/ fi-ctz ^ i±# g-e(±^^o at, -ecc^f&K^$(i#af<7)## 6 z h#mg"C6&c

(i) t # x. ^ (7)±m#i&<7)Tm(:^ (7)##%# e t

*#$IJ#l#%<7)##^^#f&ZaM#&o Z(7)^, @4

^-misf z & ca o-c. %i-&za^"c#& (Bis-Do -f&tdx ^<7)mm^T(7)##&]E#(:#;#i-6zaM#'C6&o L^L,

$^c. m%iie%(:j:^-c^fL^(7)DNA&#i-&mme

< mg#; aumdt&^a & &„

(2) y;^y 7

&. y^5/7hm'\ 7 7h7U>T'&^^^(T)#8iJ^^m^^2:L»v^&^6, m

-93- 5’

Reporter Gene

h

133-1

M|6]3) £fflv>£k<7)^Se@7°7X*>£,S4\ S^E»E5) ^AFMCAtomic Force Microsicopy)"'^k\ @4r(D^

m&. mm.

y;i/y7h&T(±. #$m(:DNA&#mi-&t&. #m^&hL"C(±32P#(=Z&%

L^'L. y^S/7hmcmv'6fL&DNAKfM-(±m^l00bp#KmTt

DNA«rM-msp<7)^mmmme#mi-&2 2:(imLv^ %^-c.

&c a##L<. $ 6c$6^w#af&c 6 2 2:^ r#. V'6ma 2: #x. 6.

y;]/y 7 b^(i. #*m:#m$fi2:DNABf#-& ?

&6%. y;i/y7b&t:^»6^)(±. frb, gBDNA'>-^>-y--&&v>t±^tf 7 V-r WSiE;?;kS6$IB7) %£\ 6S

y^i/7 b&(±M%^f|:T'C^7;<^@2:6D#

— 94 — /<> KiM£ 9 £912 Tv h^Ml.2 x 10 17 mol) 8) <9DN &oTv

& 2 a ^ G. L-c^+^^ (:w#-e&6 k#x.6*i&0 y^77 bm[±m#'CK^I##%DNAie'&'^ 7? h7V >

W'L, m#

2<7)z^ trf / u 7 b u-c

«&##&&&h#;l6fLao

References
1) Garner, M.M. and Revzin, A. (1989) BioTechniques 7, 346-355.
2) Galas, D. and Schmitz, A. (1978) Nucleic Acids Res. 5, 3157-3170.
3) Checovich, W.J., Bolger, R.E. and Burke, T. (1995) Nature 375, 254-256.
4) Davis, R.D., Edwards, P.R., Watts, H.J., Lowe, C.R., Buckle, P.E., Yeung, D., Kinning, T. and Pollard-Knight, D.B. (1994) Techniques in Protein Chemistry V, 285-292.
5) Griffith, J.D., Makhov, A., Zawel, L. and Reinberg, D. (1995) J. Mol. Biol. 246, 576-584.
6) Allison, D.P., Kerper, P.S., Doktycz, M.J., Spain, J.A., Modrich, P., Larimer, F.W., Thundat, T. and Warmack, R.J. (1996) Proc. Natl. Acad. Sci. USA 93, 8826-8829.
7) Kambara, H. and Takahashi, S. (1993) Nature 361, 565
8) McIndoe, R.A., Hood, L. and Bumgarner, R.E. (1996) Electrophoresis 17, 652-658.

(3) 7 7 h 7 V 7 h &

7 7 h 7° 9 > ]' viti Galas b Schmitz A i. 0 DNA8 d###2#

77 b79 7l'mi±300b@m^'e


&

#^L^4m (imwW ^y/Ay-^%. >%#*' ww7v bYV > h#9'H» %Jj&t LfSS:L'Cv^v>0 a y y4-myy &*&###?##?) &&&^t:cv^DNA mmt

/ y 7 u -) c# L-c##M^a-9 & c a a ^ %9o hyv>bm#^#(om8h#a(7)iB&m#a^j2:&m&'&:Mr-c tSkik&i&itobti&"?$>*> 1 jo* 3 4 5 6 7 WUK»-t*S5&'l;ov' “C
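The principle of footprinting, in which a bound protein protects its DNA site from cleavage so that the corresponding bands weaken or disappear, can be illustrated by comparing band intensities with and without protein. This is a minimal sketch assuming per-position band intensities are already available; the threshold, the intensity values and the function name are hypothetical.

```python
def protected_region(free, bound, ratio=0.3):
    """Longest run of positions where bound/free intensity is at most `ratio`.

    Returns (start, end) as inclusive indices, or None if nothing is protected.
    """
    flags = [f > 0 and b / f <= ratio for f, b in zip(free, bound)]
    best, start = None, None
    # A trailing False closes a run that extends to the last position.
    for i, flag in enumerate(flags + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if best is None or i - start > best[1] - best[0] + 1:
                best = (start, i - 1)
            start = None
    return best


# Hypothetical band intensities along a labelled fragment.
free_dna = [100] * 10
with_protein = [100, 95, 20, 10, 15, 25, 90, 100, 20, 100]
footprint = protected_region(free_dna, with_protein)
print(footprint)
```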

1) Galas, D.J. and Schmitz, A. :Nucl. Acids Res., 5, 3157-3170 (1978)
2) Siebenlist, U. and Gilbert, W. :Proc. Natl. Acad. Sci. USA, 77, 122-126 (1980)
3) Hendrickson, W. and Schleif, R. :Proc. Natl. Acad. Sci. USA, 82, 3129-3133 (1985)
4) Brunelle, A. and Schleif, R. :Proc. Natl. Acad. Sci. USA, 84, 6673-6676 (1987)
5) Tullius, T.D., Dombroski, B.A. et al. :Methods Enzymol., 155, 537-558 (1987)
6) Brenowitz, M., Senear, D.F. et al. :Methods Enzymol., 130, 132-181 (1986)
7) Kamachi, Y., Ogawa, E. et al. :J. Virol., 64, 4808-4819 (1990)
8) Church, G.M. and Gilbert, W. :Proc. Natl. Acad. Sci. USA, 81, 1991 (1984)
9) Ephrussi, A., Church, G.M., Tonegawa, S. and Gilbert, W. :Science, 227, 134 (1985)
10) Becker, P.B., Ruppert, S., Schutz, G. :Cell, 51, 435 (1987)

(4) mux,

$fcmL-cv^.

#/hxi y##e#%nr# DNAf-7xi:fw$%& t Wot, 3-1-3 v LTV'&o BPt),

&a#X.&flX&lK 2(7)Zo^ d21u, o > *^M#0f(7)#$f6feLTe9 2 2:^^r66o $K^W#af 2&d'6. f<7)%m(iDNA"e&&K##mi#%Zi)6@#"e&^ 15^(7) DNA##f%# <7)m#e#)ifuf. $-3- pi-amis

-98- ( 1 )

apt,

$- SDSy ^-c^#L-cam^mc#s# L^m. 32? ^a-c#m$ m&BEfiJ&^&DNABf#-km±-C#^a&&eMTf- h9yf y?7/f -CZ^T

$6C. Agtll^^(7)^yx^®%m77-yt: cDNA#M-^h'$"#ALT7/f ^9 V-^##L. m±C@^/^)^2:|B|#HDNAh(7)^ x m 1) Davis, R.D., Edwards, P.R., Watts, H.J., Lowe, C.R., Buckle, P.E., Yeung, D., Kinning, T. 2 ) Singh, H. et al. (1988) Cell 52, 415-429 3 ) Vinson, C.R. et al. (1988) Genes Dev. 2, 801-806.

(2) One-hybrid / two-hybrid method

##(DK5B?GAM<0%a%d'"3##MC:9#'m&co, DNA# &m$L (DNA-BD) (AD), ^ LX GAL4S ME V - V - jl 1n? e##LLT, ##^%A2:LTDNA-ma^#Z#m^#mi'&^aL"C9> -/^7 v v ma-mao^sem^^rnfa^h L-cv- - yu v

a m@ v ^^ -a^<^±m

A#c^t'T. AD-^me%^^%mg-±&i#, 6L#mma;yf<9##Bmciiga- -$:& . # c T V #-?-##? Ofgmb $: ## C L T K

(gi3-3)o /\/fyovp&^) mm-e^ao se-e-Be^iemmi-aftt'^cm^^c^t'TDNA-BDL^^ga

Mm:%m$-±tAD-/<-hf-gam-&#a#*&'&&#x.6o tL^fa^^lf, iiem%CDNA-BD2:AD^#^@e2:'e

-&&^LTim#i#CiB'&f &chc&&(7)TV;K-?-#fsf &f5%ibi-&#;%&6"3o

[Figure 3-3 One-hybrid system. Panel A labels: Yeast GAL4 Protein (DNA-BD), UAS, Promoter, Reporter Gene (lacZ or His3), Transcription. Panel B labels: cDNA Library, GAL4, Target Sequence, Promoter, Reporter Gene (lacZ or His3), Transcription. DNA-BD, DNA-binding domain; UAS, GAL4 binding sequence (Upstream Activation Sequence).]

(@3-4)o bSffl m$DNA^e^$s^$ij#iaf- vt ? u z t ^7> 7 $:3- Kf 1993^m%(:2^$- f'Jffl LA Wang

6 7- - yu 7 R#&#cT#;W(:m*&frCV'ao ggG-15) ^^3-1

£MT & Og fi It 1ST- 7 3 - - > JSffl rtf {f-C & < , M t * O fc

[Figure 3-4 Two-hybrid system. Labels: DNA-BD, AD, UAS (Upstream Activation Sequence), Promoter, Reporter Gene (lacZ or His3), Transcription, cDNA, Target Protein.]

[Table 3-1 Proteins studied with the two-hybrid system. Names legible in the scan: Rb, TNFR, HIV Gag, p53, c-Jun, Ras, Grb2, Max, SNF1, FAS/APO1; target proteins: TRAP-1, Sos1, T-antigen, c-Fos, Raf, MORT1, Cyclophilin, SNF4, phosphatase.]

S iii) Mx.1^ L-cd^, &&maa^-e# coawmmmLv'cas

&Z ad:4?&< &V';K &fr£dfr

£ mm L £ii{5^<7) W#t±liM[6] ic & 0 DNA-S 6 J: ^'S ti-§ 6 o V y'r-'J ■ Vy

References
1) Wang, M. M. and Reed, R. R.: Nature, 364, 121-126 (1993)
2) Gstaiger, M., Knoepfel, L., Georgiev, O., Schaffner, W., Hovens, C. M.: Nature, 373, 360-362 (1995)
3) Lehming, N., Thanos, D., Brickman, J. M., Ma, J., Maniatis, T., Ptashne, M.: Nature, 371, 175-179 (1994)
4) Li, J. J. and Herskowitz, I.: Science, 262, 1870-1873 (1993)
5) Strubin, M., Newell, J. W., Matthias, P.: Cell, 80, 497-506 (1995)
6) Fields, S. and Song, O.: Nature, 340, 245-247 (1989)
7) Kato, G. J., Lee, W. M. F., Chen, L., Dang, C. V.: Genes Dev., 6, 81-92
8) Song, H. Y., Dunbar, J. D., Zhang, Y. X., Guo, D., Donner, D. B.: J. Biol. Chem., 270, 3574-3581 (1995)
9) Iwabuchi, K., Li, B., Bartel, P., Fields, S.: Oncogene, 8, 1693-1696 (1993)
10) Zervos, A., Gyuris, J., Brent, R.: Cell, 72, 223-232 (1993)
11) Chardin, P., Camonis, J. H., Gale, N. W., van Aelst, L., Schlessinger, J., Wigler, M. H., Bar-Sagi, D.: Science, 260, 1338-1343 (1993)
12) Luban, J., Bossolt, K. L., Franke, E. K., Kalpana, G. V., Goff, S. P.: Cell, 73, 1067-1078 (1993)
13) Durfee, T., Becherer, K., Chen, P. L., Yeh, S. H., Yang, Y., Kilburn, A. E., Lee, W. H., Elledge, S. J.: Genes Dev., 7, 555-569 (1993)
14) Boldin, M. P., Varfolomeev, E. E., Pancer, Z., Mett, I. L., Camonis, J. H., Wallach, D.: J. Biol. Chem., 270, 7795-7798 (1995)
15) Vojtek, A., Hollenberg, S., Cooper, J.: Cell, 74, 205-214 (1993)
16) White, M. A.: Proc. Natl. Acad. Sci. USA, 93, 10001-10003 (1996)
17) Finkel, T. et al.: J. Biol. Chem., 268, 5-8 (1993)

(3) Phage display method

feim-t§ti\i\ &dnaem* &ummhy-

c<7)i:L-c±mmc#i-&7 /f

6^0 < XTV/f fif# i), 7 7 M13^ 7 7-ca^

A 7 7-y^) 7 7 7-i;^ Lt7 7-yf 4 XT'W

Cft6<07 7-yT(±?

-103- cDNA&m'T7y-yf''f X7WC j:a cDNA? 4 79 V

ttfco

fLW'c,%V##(0##6awa##i@B#di#&c,'CV'ac2:&k'd'&. 7 y-y%f

»r#%^6ao ;kx&mv'&y',fxyw#&im%& MMX* & ho ttz, fflB(7)EWtc^6^-ta 9 >/x°7Mtcs6'^l;$-ti:'CmMt:il^-t"

j:c-cmn&m#&mmta^##^6a c a & G, M13 7 ? -ya tSo w^RdDfiW, ULhUM/<9mm9n-^>y^m^(:ff9^*(:#^ceEfL^-e$)a. #(:, K $-dra2 2: =mr#T&a c ^ ^6, f

ti&o Ztih^7KF4?>*SS

i^^mBLtz, hXrty-4 >V > 7 - v v y X

tlho

References
1) Fowlkes, D.M., Adams, M.D., Fowler, V.A. and Kay, B.K. (1992) BioTechniques 13, 422-428.
2) Kay, B.K., Adey, N.B., He, Y.S., Manfredi, J.P., Mataragnon, A.H. and Fowlkes, D.M. (1993) Gene 128, 59-65.
3) Maruyama, I.N., Maruyama, H.I. and Brenner, S. (1994) Proc. Natl. Acad. Sci. USA 91, 8273-8277.
4) Sternberg, N. and Hoess, R.H. (1995) Proc. Natl. Acad. Sci. USA 92, 1609-1613.
5) Rosenberg, A., Griffin, K., Studier, F.W., McCormick, M., Berg, J., Novy, R. and Mierendorf, R. (1996) inNovations 6, 1-11.
6) Robert, A., Samuelson, P., Andreoni, C., Bachi, T., Uhlen, M., Binz, H., Nguyen, T.N. and Stahl, S. (1996) FEBS Lett. 390, 327-333.

3-2 "j v 7 - o

h%-#"#-&RNA#o * 7--tfnci o suspense signal sequence 1 Rf DNA^'f(7)#^#iSl##K)C#ISf#fHi' Z>0 > 9

— ^ "j h 'SrS BT T 9 *b X iTSB & t ^ ^ /x—x C ii> *#5iSi9;iii!f:K-F‘

Rehovot T^gf, ISREC (A 4 A) T&2:o?)7 7/f ^ (303kB, Release 48, 1996¥ 10 H) t LXWM$ fiX v>-2> (ftp://expasy.hcuge . ch/databases/epd) 0 EPD t± EMBL T— 9 • 7 i 7° 7 0 database "C,

EPD CM#*#

&#RNA#9;(7-'tniCj:9mm$fi& nx^&o $frT 0 , EMBL^titthML &--S:L&v>0 @3-5 CXn^-XSSl?)—£X

FP Hs c-myc P2+:+S PRI:HSMYCC 1+ 2490; 11148.053 010*2
XX
DO Experimental evidence: 4,4#,<2>
DO Expression/Regulation: +mitogen
RF Cell34:779 EMBOJ2:2375 MCB7:1393 MCB7:2988
//
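As a rough illustration of how the FP line of such an entry can be taken apart, the following Python sketch splits the sample line on whitespace. This is a simplification made only for this example: the function name `parse_fp_line` and the dictionary keys are hypothetical, and the real EPD flat file is column-oriented, so a production parser would slice by column instead.

```python
# Minimal sketch: split the sample FP line of the c-myc entry into fields.
# Assumption: fields are separated by whitespace, as in the sample line;
# the key names below are illustrative, not official EPD field names.

FP_LINE = "FP Hs c-myc P2+:+S PRI:HSMYCC 1+ 2490; 11148.053 010*2"


def parse_fp_line(line):
    """Return a dict of the fields found on one whitespace-separated FP line."""
    (tag, species, gene, promoter_status, embl_ref,
     topology_strand, position, codes, alt_code) = line.split()
    if tag != "FP":
        raise ValueError("not an FP line")
    promoter, status = promoter_status.split(":")
    division, entry_name = embl_ref.split(":")
    entry_code, homology_group = codes.split(".")
    return {
        "species": species,
        "gene": gene,
        "promoter": promoter,
        "status": status,
        "embl_division": division,
        "embl_entry_name": entry_name,
        "strand": topology_strand[-1],          # last character is + or -
        "position": int(position.rstrip(";")),  # drop the trailing semicolon
        "entry_code": entry_code,
        "homology_group": homology_group,
        "alt_promoter_code": alt_code,
    }


fields = parse_fp_line(FP_LINE)
print(fields["embl_entry_name"], fields["position"], fields["strand"])
```

Keeping the homology group as a string preserves its leading zero (053).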

Figure 3-5 Example of an EPD entry 1)

In Figure 3-5, the information on one promoter is given on the FP line, and the entry is terminated by (//). The FP line carries the promoter name (columns 6-30), the reference to the EMBL sequence entry (columns 31-55) and further codes (columns 68-72), together with the sequence topology (circular/linear) and the strand (+/-). The FP line of Figure 3-5 reads as follows: species Hs (human), gene c-myc, promoter P2, subset status independent, initiation site type single, EMBL division code PRI, entry name HSMYCC, sequence topology linear, strand +, position 2490, entry code 11148, homology group number 053, and alternative promoter identification code 010*2. The DO lines describe the promoter: one DO line gives the experimental evidence, here including length measurement of an RNA product, and the other gives expression and regulation, here mitogen.

The Transcription Factor Database (TFD) 2) collects sequence information related to transcription factors and is maintained by David Ghosh (ghoshd@rockvax.rockefeller.edu). It consists of the following tables.

CLONES (2106 records)
DOMAINS (1016 records)
FACTORS (523 records)
POLYPEPTIDES (1626 records)
SITES (2155 records)
METHODS (38 records)
N_POINTERS (2876 records)
REFERENCES (8509 records)
X_POINTERS (5757 records)

(Release 7.5, March 1996; ftp://ncbi.nlm.nih.gov/repository/TFD)

CLONESliK^BfcDNAcoK^J

DOMAINS / FACTORS Izmir & -®-ltIE£ x POLYPEPTIDES 5 / ESd^iJ £ x SITESl±K#Bf C1: METHODS(i^#^&e, REFERENCES ii 31 ffl il© 7 7" X o N_POINTERS t X_POINTERS - ^^-7 (Genbank, EMBL, SwissProt, PIR, Genpept, NCBI-Backbone) k TFD t % ') > tirZtzibiztiiFfl SttSo Kk±.

TFD r— ? ^ - x cb SITESt - 7Gv f & £ 0 CtU±, CZ m-7ACf ^<7)fm± 14i@<7)7 i ~)V Kiz-lflE^SBESnTv^o ::tf, SITE_ID (± Sites entry identifier, FAC_NAME SEQ_NAME liSi^J £ /;SxW> hco^BUx NA_SEQ immm. SEQ_TYPE mm£tif, MAIN_REFi±±

^&m*x LOCAT-REFt±, LOCATIONSS<7)#bs ti;, LOCATION limRNA start U METHOD N_PROB (2^<7)

SITES@W^-yA@^J±[:#^I:m6fi6 #

$, REF_N i± reference number ~Qhho TFD Offi'C, ^CHCE& t — 7"(±, RT-co Sit k ## e Ei£ L fc DOMAINS *C* & £» DOMAINS MW £ 0 3-7B "C,

[Figure 3-6 Examples of TFD entries. A: an entry of the SITES table. B: an entry of the DOMAINS table (DOMAIN_ID D00293; FAC_NAME c-Myc; STRUC_CLAS helix-loop-helix; FUNC_CLAS dimerization; COMMENTS Myc similarity region; MAIN_REF Genes Dev, 167-79 (1990)).]

DOMAIN_ID is the Domains entry identifier, STRUC_CLAS the structural class of the domain, DOMAIN_NUM the number of the domain counted from the amino terminus, AA_START the starting position and AA_END the stop position of the domain, AA_SEQ the amino acid sequence, and FUNC_CLAS the functional class.

1993 Mizushima ^ i±TFD £ tC t:!£¥0¥ 7'- 7 ^ - X (TFDB)

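TRANSFAC is a database of cis-acting DNA elements and the trans-acting factors that interact with them. TRANSFAC began in 1988 as a printed compilation 5) and was converted to electronic form in 1990 (http://transfac.gbf-braunschweig.de). The TRANSFAC database consists of the following tables.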

Release 3.0
1. SITE (4299 entries)
2. FACTOR (1763 entries)
3. CELL (816 entries)
4. CLASS (27 entries)

[Figure 3-8 TRANSFAC database: an example entry of the SITE table (fields: Identifier, Sequence, Binding Factor, References, EMBL ID) and an example entry of the FACTOR table (Factor: AP-4; fields: Synonyms, Class, Cell specificity, Interactions, Functional properties, References).]

-108- 5. MATRIX (246 entries) B3-8C. SITES &FACTORS####M&#f*'o SITES###;:#. h I'). ^%^c#A##. -g-#^#. gtSi. Rtf^H C#T-7;i/#(I h Af#^/f h#. MS i"£ EMBUGenBanktmC-Sl $ tvC v S (Comment 11)» FACTORS#. Ztib

SITES i> FACTORS &mHV X b^#^^ CELLS#

#"4-4 CLASS#^Bf^7A;:o^-c#^###$r Gtto MATRIX# (^i:%ftm») $5^BT-<7)*§^-tM h izmth nuleotide distribution matrix £ 'j- A 6 c TRANSFAC #GBF#7n y^ 9 b#-g|VZihhiiK Z tit Mil# fc* European Comission# TRAD AT ~fxiVs.it H±, TRANSFAC 6 r"- ? ^ - X ###1:

M##%DNAK^I#^@^mr7-;i/#m%$'BmL"Cv^o

#####$#hL-c. ho
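To make concrete how a nucleotide distribution matrix of the kind stored in the MATRIX table can be used, the following Python sketch builds a count matrix from aligned binding sites, converts it to log-odds scores, and scans a sequence for the best-scoring window. The example sites, the pseudocount, the uniform background and all function names are assumptions chosen for illustration; they are not taken from TRANSFAC.

```python
import math

BASES = "ACGT"


def count_matrix(sites):
    """Count each base at each position of equal-length aligned sites."""
    counts = [{b: 0 for b in BASES} for _ in range(len(sites[0]))]
    for site in sites:
        for i, base in enumerate(site):
            counts[i][base] += 1
    return counts


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log2(frequency / background) scores per position."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({b: math.log2((column[b] + pseudocount) / total / background)
                       for b in BASES})
    return matrix


def best_hit(matrix, seq):
    """Return (score, position) of the highest-scoring window in seq."""
    width = len(matrix)
    hits = [(sum(matrix[i][seq[p + i]] for i in range(width)), p)
            for p in range(len(seq) - width + 1)]
    return max(hits)


# Hypothetical aligned GC-box-like sites, used only as example input.
example_sites = ["GGGCGG", "GGGCGG", "GGGCGT", "GGGCGG"]
pwm = log_odds(count_matrix(example_sites))
score, position = best_hit(pwm, "ATATGGGCGGTTAA")
print(position, round(score, 2))
```

The pseudocount keeps the logarithm finite for bases never observed at a position; other smoothing schemes are equally possible.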

(2)

mfKfW#y<^l/-ya>#^aLT#. l77-y#M%mm%. ya9y37/<^# tryptophan operon # $'] # # ri^lfhhh 9' 0

17 7 - y #7:#mm%f & virus ? 6 & d*. f ###E »#%## ^ #;&&##

^ac(ri#m#. #m dysis) &##%. ±mmc«B%Ld:i

t7^ac#i§m (lysogeny) ±#m(:#%L^17 7-y#%A##.

77-yd^#±mm;:#%L"c#@i-&,:a^m#fi77-y#migm@#y < ^ F-y a 7CUBLT#MeyersCi^^tirW^vE, 10)= C###T#. 7°v?? y a

—109 — > • 3 McAdams t Shapiro (±> itfe

jU't'O, M^-Z-'s

5 JL !/-'> 3 > 2: LT##L, -ZrcoZamZ i> izx&fflftt LX, Shimada b Hagiya bcojLW- h-£z>iMz, h &o -%(=, M &if-then-else ;v-;v*et± , roa_h«;^;vi5i^l-cUi\ ES

^y ; al/-y 3 > j^{ii3£& V'ijl'a ii%%r t'0 17r-’/®'/5 J- V — i/ 3 > {= £> V-1 "C & > f C"C, if-then-else^~* X$li&1rS:i©(D*4^8&&(=#L XV^;v

2: LTV^O y 3 7i/3 >//< f #y;i/-7T6&o =1 ay 9 y/*9Kl£embryo @2£* LT& D. c<7)^Ki##e^A/'c^(ofTyy' - ? 3 C(Ot^y 3 7$"6h(: LT, C(7)7^:<7)% embryo l±14#(7)t^ y 3 71=^^^ft, 14#<0#

Meinhardt(i^T;i/-^ - ?7/ -c c ^ u ^ ^. Reimtz t Sharp(i> y^^Mt^T )V-)V • ?

& £ d J — 9^ simulated annealing (= £ o T^*Tv^ 0 -€"<7) i|g#aL"C. «*-CiSV'£ t ^»i:/TLf;14)o Arita t Hagiya $> t±. Simfly (Simfly-l 15> b Simfly-2^')

-110- ^h#x.x ^ LO#5#m&? >^?Xffico±£frfj: FL#t'#J ^r^^ik^LtZo Simfly i±, embryo%###CBc^^ < a. L-

if amount(bcd) > 3 & amount(nos) < 1 then create(hb)

X - 9 iLMLfco L<7)L§ W6

^VXAemv^T##f&o Simny-g-Cd, f^-vX- ^^/^gX^-X&^X;!/-
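A rule of the form "if amount(bcd) > 3 & amount(nos) < 1 then create(hb)" can be read as a condition on the amounts of gene products in one region of the embryo followed by an action. The Python sketch below shows one way such if-then rules could be evaluated over a row of cells. It is not the Simfly implementation: the cell representation, the interpretation of "create" as adding one unit, and the example amounts are all assumptions made for illustration.

```python
# Minimal sketch of rule-based simulation with if-then rules.
# Each cell is a dict mapping a gene product name to its amount.


def rule_hb(cell):
    """if amount(bcd) > 3 & amount(nos) < 1 then create(hb)"""
    if cell.get("bcd", 0) > 3 and cell.get("nos", 0) < 1:
        # Assumption: 'create' adds one unit of the product.
        cell["hb"] = cell.get("hb", 0) + 1


def step(cells, rules):
    """Apply every rule once to every cell and return the cells."""
    for cell in cells:
        for rule in rules:
            rule(cell)
    return cells


# Hypothetical anterior-to-posterior row of cells: bcd high at the
# anterior end, nos high at the posterior end.
embryo = [
    {"bcd": 5, "nos": 0},
    {"bcd": 4, "nos": 0},
    {"bcd": 2, "nos": 1},
    {"bcd": 0, "nos": 4},
]

step(embryo, [rule_hb])
print([cell.get("hb", 0) for cell in embryo])
```

Writing each rule as an ordinary function keeps the rule set easy to extend, at the cost of the declarative syntax a dedicated rule language would offer.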

Karp(±x tryptophan operonLXx £nISi^--X ^fflvXcTEtttoX s a- y~y a y*'il^tzll)o tm> C(7)y < jLP-^eGENSIMh^#^:o Cftli. 7 L- tryptophan {z§|#LTi'& 6M#&h'&f7yjL?l'2:L'CmmL2:W&'<-x (CKB) L#Xr(DK^$:Xa4:x,

aLTm%L^Bm^-x (PKB) &6&&0 PKB(ilb#KB(D-'jR^7xaa6(:# (BI3-9)c GENSIMlix JELt'y t/yi

Mavrovouniotis (DT ivziv X-AbfiZ> 0 #l±x BSlxS(zfrfrfo&WX ^6GibbsLx Gibbs J:^Kfo

[Figure 3-9 Karp's model: hierarchy of process classes. Class groups legible in the scan: Binding.Processes (for example Repressor.Binds.Operator, Polymerase.Binds.Promoter, Ribosome.Binds.RBS, Trp-Repressor.Binds.Operator), Catalysis.Processes (for example Trp-Synthetase.Catalysis, Anthranilate-Synthetase.Catalysis, RNAse.Catalysis), tRNA.Charging, Transcription.Processes (Transcription.Initiation, Transcription.Elongation, Transcription.Termination, Leaky.Transcription.Termination) and Translation.Processes (Translation.Initiation, Translation.Elongation, Translation.Termination).]

SS/utx^olfflCU) , Z Z

Karp (oevcm 6Z 9 (: j: & y < ih-va me?

References
1) Bucher, P. and Trifonov, E. N.: Nucleic Acids Res., 14, 10009-10026 (1986)
2) Ghosh, D.: Nucleic Acids Res., 18, 1749-1756 (1990)
3) Ghosh, D.: Nucleic Acids Res., 20, 2091-2093 (1992)
4) Ozaki, T. et al.: Genome Informatics Workshop 1996, 218-219 (1996)
5) Wingender, E.: Nucleic Acids Res., 16, 1879-1902 (1988)
6) Faisst, S. and Meyer, S.: Nucleic Acids Res., 20, 3-26 (1992)
7) Dhawale and Lande: Nucleic Acids Res., 21, 5537-5546 (1994)
8) Wingender, E.: J. Biotechnol., 35, 273-280 (1994)
9) mm a: 37, 941-945 (1996)
10) Meyers, S.: Nucleic Acids Res., 12, 1-9 (1984)
11) McAdams, H. H. and Shapiro, L.: Science, 269, 650-656 (1995)
12) Shimada, T. et al.: International Journal of Artificial Intelligence Tools, 4, 511-524 (1995)
13) Meinhardt, H.: Development, 104 Supplement, 95-110 (1988)
14) Reinitz, J. and Sharp, D.: "Mechanism of Eve Stripe Formation", Technical Report LA-UR-94-1915, Los Alamos National Laboratory (1994)
15) Arita, M. et al.: Genome Informatics Workshop 1994, 80-89 (1994)
16) Arita, M.: Genome Informatics Workshop 1995, 29-38 (1995)
17) Karp, P. D.: "Artificial Intelligence and Molecular Biology", pp. 289-324, The MIT Press (1993)
18) Mavrovouniotis, M. L.: "Artificial Intelligence and Molecular Biology", pp. 325-364, The MIT Press (1993)

4.

4-1 *

t L-CV'So f 5 2 a i: j: N J Aft

& & a t' 9 & o t'-c

V1 -?>o

-113- 4-1 -1 NMR 1C «£ 5 DNA, RNA, ##^^(7)&##M<7)NMR(Nuclear Magnetic Resonance)# C <£ 1980 ¥ ft U Wiirich b «-T< ^^ y la-a£ScJ) i: Distance Geometry#^ 3, Simulated Annealing A hxogf#$f#

1990^ftCA&t, Mf, "C, =H)

(3^7C#ia) ^>XX@(7)AL#:

$^NMR#-e(i, #2Mf '=N, "C/H) #G#&<7)-e, <%#%## (mmm#) Kf&&k'?)f##;M#&fi&c k 4)±# O-C6&0 >//

?>/b9;kK/

Wc ltv ^c: 5 • ®^VA:;i/-eoS)aX)BfftillE ’e

i-&Z kli'&m-C'&Zc MC, >'S?Wk DNAW^#(complex)<7) 3'A &, #%/sX&®$r&iitorffik&i) . -E-fU:$!im$#"CNMR#

-114- ccrti, %Mx

^#(7)NMR&(: NMR&-ea'(7)j:9 frlzirho Wurich3~6) 6>(±> '> a ^ y a ^ Antenapedia <7) * * jr V J 4 >5r&H<7) 60m

<7)t:\ Zti^co 1:1 complextrou-C 15N £x> V '7 ^ v>,

#^(7)2^, S^TcNMR^m&L. C(7)#emti, 4# ##&fb# v 7 b 6 i: „ free t complex <7)3?fi

NOE(7)X^->

ti&Vo fc>~f /Hr^ V 7 ^ X II

^BL-Cv^ck^m*6fi^o ##faDNAm (14#*) 6BmDNA<7)#m^hc Tj3 0, 6^:#

NMR 'i£Rl/ distance geometry StSi£ £ ffl v> X #$f Lfco H-NS 9 's R<7) C )K^<7) 47?$S<7) ^ t:(ix 11@<7) a-helixS.?/1 1@

#H-NS ^ DNAh^-a-L-cv^c: Schumachu £>8) ti> Lac V 7°V -y- 1)"—<7)^ot:'dfe-£79 V > • })XU ■y-f—9 >/S?K (PuR)<7) DNA In a" IwOV't #% =tr fr v \ C(7)^ >/<^%<7)N t± helix-turn-helix <7)

rigid&#m&2:cTj3lX C^m^DNAh^LtTV'&Ck^^^^CL^o

DNA<7)yat-^-#^(:Bg^|##A(M:i|g4'f6i6^afe&6c- Myb trout: NMR & t?ffl& L tz „ c-Myb ti 521@(7)T 5 3o<7)|| I) SLE

^|J(R1, R2, R3)e2:^t:^l). DNA@8&#AACNGm=]>t>'9-x@d#l&mmi-a. #

6RR1R2R3 5:R2R3ti^^'-E-^DNA complex ;$$M S fi> ^<7)2^, 3^tcNMR nmmfrtifzo m&nmrx ^? y )v

Rl<7)%mu\ DNA^4-#t:±#»^#i:2iU3:$^^o A*Z:Lt:, ri, R2, e3«3>7m-'>3 >iii

—115 — mutant<£> NMR& ¥

<7)#%##, D. R. Kearns l±. ±^<-c<7)/<^TV7(:Ev^$^#(:^^6&gaK Transcription Factor 1 (TF1)<7) NMR 7^7 b %fto tza C.<7)TF1 (±> 22 KDa <7)homodimeric &gaStt\ 5’-(hydroxymethyl)-2 ’-deoxyuridine (hmUra)^#± DNAtcM'n't^o C<7)ga#(i, b-ribbon motifflU&fc'g ’A'Cis 0 > ££BJ$L±ga 3#(7)'\V7?7#im&&3'ATV'&o C(7)9t,2#(Rl, R2)(ia-^V v^X tp l#(R3)lia-/\U 7?7m?&t#mT6&oDNAC*S<V'&<7)kL C<7)a-^U v v ^^m(R3)@B^-e&6Ch^6^c^o $ TFltix hfllitSr b *K & < folding LX&*), Zcop-s- R3 b&< fBEM LX^&Zb fitz0

M<7)V'< o^?Cii5|, NMR&C Z & DNA binding S6K^> j; 0 (zSlro*C§ b 0 CL<7),d?(± ga® 7Ki§OTt-&V'Tfree<7)ga aaDNACiiaeL^:SaR(D^##m(i. 6$0±#<^t'cTv^v^-9'e&6o a m#/<^®<7)^±-7 k L"C(±, 4#m<7)^±-7^#$$ftT^&^,i5)^ 0/\ ij '7 7 x • • A'j -;n, (D Cys • His • Z 7 -f > —

<7)666. ±<7)4# mw#<7)f ±-7m&v'%$%aM###&6o 4*#> C^V'-CNMR&C j: I). v^v^^DNA^gag^DNAgg^j;

[Figure 4-1 DNA-binding motif: Cys-His-Zn finger. C: cysteine, H: histidine, F: phenylalanine, L: leucine, Y: tyrosine.]

-E-ftaatC. (Mw-30K) cnt:^L> NMR^(o#m-e&6, ^#gijcm#jL, #a&u3#;m&7 7n-f

-use. i5N. 2H; %<

i LT. @^NMRa#Wjmt#x.^fi'&'C669o

References
1) Kline, A. D., Braun, W., Wüthrich, K.: J. Mol. Biol., 189, 377-382 (1986)
2) Kay, L. E., Torchia, D. A. and Bax, A.: Biochemistry, 28, 8972-8979 (1989)
3) Qian, Y. Q., Billeter, M., Otting, G., Müller, M., Gehring, W. J. and Wüthrich, K.: Cell, 59, 573-580 (1989)
4) Billeter, M., Qian, Y. Q., Otting, G., Müller, M., Gehring, W. J. and Wüthrich, K.: J. Mol. Biol., 214, 183-197 (1990)
5) Otting, G., Qian, Y. Q., Billeter, M., Müller, M., Affolter, M., Gehring, W. J. and Wüthrich, K.: EMBO J., 9, 3085-3092 (1990)
6) Otting, G. and Wüthrich, K.: Quart. Rev. Biophys., 23, 39-96 (1990)
7) Shindo, H., Iwaki, H., Ieda, R., Kurumizaka, H., Kuboniwa, H.: FEBS Letters, 360, 125-131 (1991)
8) Schumacher, M. A., Macdonald, J. R., Bjorkman, J., Mowbray, S. L., Brennan, R. G.: J. Biol. Chem., 268 (1993)
9) Ogata, K., Hojo, H., Aimoto, S., Nakai, T., Nakamura, H., Sarai, A., Ishii, S. and Nishimura, Y.: Proc. Natl. Acad. Sci. USA, 89, 6428-6432 (1992)
10) Ogata, K., Morikawa, S., Nakamura, H., Sekikawa, A., Inoue, T., Kanai, H., Sarai, A., Ishii, S. and Nishimura, Y.: Cell, 79, 639-648 (1994)
11) Ogata, K., Morikawa, S., Nakamura, H., Hojo, H., Yoshimura, S., Zhang, R., Aimoto, S., Ametani, Y., Hirata, Z., Sarai, A., Ishii, S. and Nishimura, Y.: Structural Biology, 2, 309-320 (1995)
12) Reisman, J. M., Hsu, V. L., Encontre, I. J., Lecou, C., Sayre, M. H., Kearns, D. R. and Parello, J.: Eur. J. Biochem., 213, 865-873 (1993)
13) Jia, X., Reisman, J. M., Hsu, V. L., Geiduschek, E. P., Parello, J. and Kearns, D. R.: Biochemistry, 33, 8842-8852 (1994)
14) Struhl, K.: Trends Biochem. Sci., 14, 137 (1989)
15) Lee, M. S., Gippert, G. P., Soman, K. V., Case, D. A. and Wright, P. E.: Science, 245, 635 (1989)

4-1-2 x a *

6iA#:#m7'-^(i4000m_L (1996^7H)

^ aczc

-118- H-CafeSo tzbx. ti\ Z tX'Ul1

(i)

<##?#&

#co#^W'o'Ti'&#^i;L Mt&"C 1 n B&Pm'Mffi %a/mma#;t6frCV'&o ##ya^9 A[i, 4fUX(D#%%# (Biotechnology and Biological Sciences Research council, BBSRC) togdW'f" & CCP4/$$fiE!£l$t-k

m^$tL"cv^c mia-rii. acj:c'CiK#Yi:(:^L^#^6&&o at. ^m<7)#A6#%T6&o i^a$

NMR cz &a a & c# < 7 > x

6^0 C:fi$-C(:l±#;t6fiTV'&d',3&7

1992^(:^af j%f Xf <^)TATA-box >/

-119- /)Xb7>K^ c^^-cv'&o ^(D#mmmmcDNAa^L, ^DNA&%x.. ^mm^RNA^U^^-if^a^yzL-v hh##%cav'h#x.g)^TV' ha

(3) NMR C J: 2: 6 C, c a C4"# t±v4:%#$fi"Cv^&c zti£-c

####$%.

^BU^VT(±> Ai * ;v *?-W)m-m%P)r U*> it & Mhtfi] in mm (Photon Factory)

^#c. (spnng- 8> o. mmu

(0N#R^ff^a frc&o -E-Kczc-c.

4-1-3 3 D-1 D77-f>^>h

&. 3 D-1D 7 ? / > ^ y n±, ?aw** 2: ^ & 7 < /ma#u

-120- v ^ >/

(1) T^d'VXA Eigsenberg 6 ^ #)&^m^(D7 < (2 ^ ##, C#cf^T^mL^E#^(:gEmL^ya7^-;]/7/f79U^##g f&o $^, 7 (±, -E-^^r# (a?*-^^ >y&) em^L^o ^au-c, Tsvae^Toammac. g UarlSIft) <0#%;MW6fiTV'&o folding recongnition 1& IFFifiL'CV'-So

(2) 9 >/<7(Meeting on Critical Assessment of Techniques for Protein Structure Prediction (CASP-1), 7'>nv, * 0 ~7 t )V — 7, 1994 ; S 2 @ B (± 1996¥ 12 B (CHIB) TMi, >rX b (DWmtfftfthnx

(33#), %## (66#), -E-LTAbInitio&, (29 #) -E-(D^,

9 4>;C/1''C#2::WCW:&<,

-121- ##&&&?>/

(3)

3 D-l D794 7<

2(7)Z9t:, 3D-iDmu\ #j& 7 1J-XJ»<7)fe&Jf‘m*?'fr*&& ’t2>

#c. f%&?>/

-jl-7 71/^. -7 l'7“^W^f;^l^/;fMS$ti-C^I>u)o ##%

^ m 1) Branden, C. & Tooze, J. (1991) Introduction to Protein Structure. Garland Publishing, Inc. New York. 2) MMA r#mdE##C^G& (1996) &##@ 36,5. pp 211-215. 3) (1996 ) 36,5. pp 211-215.

4) im## j (1996 ) 36,5. pp 226- 231.Lemer, C.M.R, Rooman,M.J. & Wodak, J.S. (1995) Proteins, 23, 337-355.

s) awm# rDNA^ty-7<7)#imammK^<7)mmj (1993 ) 33,3. pp 136-141. 6) Nikolov, D.B., Hu, S.H, Lin, J., Gasch, A., Hoffmann, A., Horikoshi, M., Chua, N. H., Roeder, R.G. & Burley, S.K. (1992) Nature, 356, 505-512.

7) Lemer, C.M.-R., Rooman, M.J. & Wodak, S.J. (1995) Protein Structure Prediction by Threading Methods: Evaluation of Current Techniques. Proteins: Structure, Function and Genetics, 23, 337-355.
8) Jones, D.T., Taylor, W.R. & Thornton, J.M. (1992) A New Approach to Protein Fold Recognition. Nature, 358, 86-89.
9) Bowie, J.U., Luthy, R. & Eisenberg, D. (1991) A Method to Identify Protein Sequences That Fold into a Known Three-dimensional Structure. Science, 253, 164-170.

10) ####, tf 7 y u v~

4-2 f >/

4-2-1 (1) mm

(2)

(S.ceremsWl: # t'T tL^: L C: ft GCgE#6ft&

-CV'&y 3 3 b 7 >y##:bfrcv'&2,3)° ##ca

-cm#-, (134-3)^0 Ltz Embryonic Stem Cell (ES M»)<7)W]£6)

-123- ojr^vtx&mKT^&o fLTf^^ox^Ammw^-yvT/f EsmJ&frt>ft'ikLtzi>coxfoiu£, jstsc tiz£t)y-y?T'yb'?'yzt Lx$ab

A

ammm?

B B&T^Siifg?

a ; smm^zHXSiK&zsmttsmm) b : (#xs)

m 4-2 y-'fvTiy yomm

iie^-y-yr-# >4,&)fo/zESElSj'n->6 E*7^X%±W«B:aXT&.

Implantation 5*

ffiS)l*6t$ESEiS6»iDEgSA5?S IS >) So/i

I

f^xomanmWEswam* tt<£3 e>i:-^T-nC'lBl*;|9±fySHt-6'b-S: 0 *toK#"70x^-c#3.

04-3 ESfflE5rfJfflL/-c-7'>X(7)it^^-y. 7T^ >y (3) mma - mm

c-na&a.

6w#mT6&c stage specific CfL^r#9 B%"C3>y/f y af^^-y-yy/f >

L^L#^E. ^y- > ^-y V f /f >ye B#C C t ^m#76o 4-2-2 7>f t>%=f V339 L/#f K)$ (i) mm #m<7) DNA 2 % U RNA (mRNA, ^7 /f ;i/ % y / A RNA) C# L*#% ^K^!l gr #7 & y v ^ i%yy #Mm#y##m&MM7ac:a$-##2:7

M###aL-c a) K^a^-rom# (u) RNAxyy/fy^y^m# m mRNA(7)l<^%fk (iv) (04-4). MM## (: K7 & L v>o
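Because an antisense oligonucleotide acts by base pairing with its target RNA, its sequence is the reverse complement of the targeted stretch. The Python sketch below derives an antisense DNA oligonucleotide for a chosen window of an mRNA. The function name, the example mRNA and the choice of a window around the AUG start codon are illustrative assumptions; practical design also has to consider target accessibility and oligonucleotide chemistry, which are not modeled here.

```python
# Map each RNA base to the DNA base that pairs with it.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length):
    """Return the antisense DNA oligo (5'->3') for mrna[start:start+length]."""
    target = mrna[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    # Complement every base, then reverse so the result reads 5'->3'.
    return target.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


# Hypothetical mRNA fragment; the window covers the AUG start codon.
mrna = "GGCACCAUGGCUUCG"
print(antisense_oligo(mrna, start=3, length=9))
```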

[Figure 4-4 Mechanisms of action of antisense oligonucleotides. Labels: antisense oligonucleotide, mRNA, degradation by RNase H, 5' cap, AUG, poly(A) tail (AAAAA), ribosome.]

-125- (2) ms ($IJ#PM) CCV'TU S< > 1960$@d'67 < ^ y >%####, § ^ CH 1970^(7)77 K(7)%»##. igg^^c

Weintraub 6C j: <7)WJ^±h cm@(±7>f-t yyty%fV =fj% ^ >f f- ^tzo cm##C#DNA^^##(:%c^c t

(3)

TV'&o I)c< %ck#:am:#u,'o (7)*em]#ji-3av^m^<7)m#(:i±^v'@M%^ #g|6]'ae^oSaa<7)|i%Hi±v><'O75><7)A:>^-^-^||W[)SA'ev^0

"C v^o C. m j: 9 (z#^- ^^#(1 joV^fx in vitro > in vivo Wffii-iBV'TJ^fflfijF^L/^ffi

( 4 ) pwa - gR@ LT(±#6*#^^&"e6&h^x.&o cm ca(±#%myv yXTArn^cm^&^^f v^#A6 fi&o L^L.

^ v^ VS^r5l§$3C LTv&c ttz'$\

-126- tiUto mRNA XT’? 4 V >X\ Vt-/<-#e#AL"C, 7>f-fe>

X $$!] <7) optimization £ HI & £ —WiVcf b-it'^inbho

References
1) Rothstein, R.: Methods Enzymol., 194, 281-301 (1991)
2) Kaiser, A. et al.: Proc. Natl. Acad. Sci. USA, 87, 1686-1690 (1990)
3) Zwaal, R. R. et al.: Proc. Natl. Acad. Sci. USA, 90, 7431-7435 (1993)
4) Miao, Z. H. et al.: Plant J., 7, 359-365 (1995)
5) , Tx- > ? -y ? T 4 > fj, (1995)
6) Capecchi, M. R. et al.: Science, 244, 1288-1292 (1989)
7) Gu, H. et al.: Science, 265, 103-106 (1994)
8) Murray, J. A. H.: "Antisense RNA and DNA", Modern Cell Biology 11, Wiley-Liss Inc. (1992)
9) Uhlmann, E. and Peyman, A.: Chemical Reviews, 90, 543-584 (1990)
10) Zamecnik, P. C. and Stephenson, M. L.: Proc. Natl. Acad. Sci. USA, 75, 280-284 (1978)
11) Izant, J. G. and Weintraub, H.: Cell, 36, 1007-1015 (1984)
12) Crooke, S. T.: "Therapeutic Applications of Oligonucleotides", Springer Verlag, R. G. Landes Company (1995)
13) #### 48, 180 (1990)
14) Morishita, R., Gibbons, G. H., Ellison, K. E., Nakajima, M., Zhang, L., Kaneda, Y., Ogihara, T., Dzau, V. J.: Proc. Natl. Acad. Sci. USA, 90, 8474-8478 (1993)

4-2-3 75V#6(DUimi'J

b BLAST, FASTA, Smith-Waterman & & t'U * 0 BLASTS. *9"/ 0^9 AT6 0.

-127- FASTA(±Pearson(Univ of Virginia)

$ tifz'/tt ?’ 7 A T\ yx y '> jl IE & v x ;!i Jilt £ fr & V1 & /)' (b Smith-Waterman

m%Srnith-

^ 9 2 a amm-c #^ vx^ x ^ tr c j; o & 2 a c a

f & 2 t ^"C # & g#vx#iR#^^%&##.# Ld'Eod' G $ 6 [:#
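Of the search methods named above, Smith-Waterman is the one that computes an optimal local alignment by dynamic programming, which is why it is slower but more sensitive than heuristic programs. The Python sketch below shows the basic recurrence and traceback with a simple match/mismatch score and a linear gap penalty. The scoring values and the function name are assumptions for illustration; real searches use substitution matrices and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

The full matrix costs time and memory proportional to the product of the two sequence lengths, which is the reason database searches usually fall back on heuristics such as BLAST or FASTA.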

)17?'i *y b <0#&C(±aa?lJA#W79 /f / 7 b &^&bMPJxtt)'&7 y 4 * 7 b t LXft&frti&2tri^x0 m^OMtLX (± EMBL *C MS fitz CLUSTALW

###Mk LT(± NCBI MACAW » VWb h „ -ft, 2 9 L£v;Vft7;VT9^ 7 7b VSlCff^o t, #%<7)7 7 < U -^###

2fu:^f3##^e»9^^##vx#^h^c-cvx&« # i,tf\£a.y — ?£ D , Sift 1,000 &£ ^ (7)^ft-7^Sm5n-Cvx^0 PROSITE7(7)iiWAe^fflvx-c^f^bnrvx^,^ cft&f i-

7 9 7 7 7 bOft(7*n -7 9)£ lx o St* IBtoLit LTM $ ilftt'- 7^-7^' Henikoff (Fred Hutchinson Cancer Research Center) b U j: h BLOCKS Xh D , PROSITE t ^Xi < ffltx f,ft*o
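A PROSITE-style pattern can be searched in a sequence by translating it into a regular expression. The Python sketch below handles the common pattern elements: `x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repetition, and `<` or `>` for anchoring to the sequence ends. The function name and the test sequence are assumptions for illustration; the two patterns are quoted from memory of the PROSITE N-glycosylation and C2H2 zinc finger entries and should be checked against the database before use.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        m = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if m:                                # repetition: x(3) or x(2,4)
            element = m.group(1)
            upper = "," + m.group(3) if m.group(3) else ""
            repeat = "{" + m.group(2) + upper + "}"
        if element == "x":
            core = "."                       # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"  # any residue except these
        else:
            core = element                   # one residue or a [..] set
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


glyco = prosite_to_regex("N-{P}-[ST]-{P}.")
finger = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H")
print(glyco)
print(finger)
print(re.search(glyco, "MKNGSAL").start())
```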

2u/;^Dy-ti, nmmn< dnrnt m

(2frG$-tft-7(:^*62a6'r#&^%#B<7)#^*(i^vx),

^-<7)[p#Gl:j:&PSORT(±9yft;7 ?##&

Of the programs described above, BLAST, FASTA, ClustalW, MOTIF (PROSITE) and PSORT can be used on the GenomeNet WWW server (http://www.genome.ad.jp) (Figure 4-5).

[Figure 4-5 GenomeNet WWW server home page (Institute for Chemical Research, Kyoto University; Human Genome Center, Institute of Medical Science, University of Tokyo). Menu items: About GenomeNet; Genome Databases in Japan; KEGG: Kyoto Encyclopedia of Genes and Genomes; DBGET Integrated Database Retrieval System; Sequence Interpretation Tools; IDEAS Interface to DBGET/KEGG/BLAST/FASTA; Anonymous FTP of the GenomeNet; Molecular Biology WWW Servers in the World.]

4-2-4 $ >j^wsm9- f t X >/x°xK<7)T 5 7 EEXiR ^e-7 - 7 & X, ^7 ® f-X2:^%^L^7*-X/<-XyXTAhL"ni, TV'S DBget yxr A (04-6) NIH <7) NCBIO Entrez '>XrA (04-7) (http://www.ncbi.nlm.nih . gov) & £'fr£>&0 C.ft6<7)yXTA"C(i, X 7/

XyXTA"C(i^>/

/;yxf A fc Lt KEGG http://www.genome.ad.jp ) /)$&-2>o ClWyXf AtiL X

[Figure 4-6 DBGET and Figure 4-7 Entrez: screen images. Legible content: the NCBI WWW Entrez browser (Version 3.8, last update 8/15/96) with searches of MEDLINE and of the protein, nucleotide, 3-D structure and genome databases, and DBGET database links.]

WLMMBL&itT*- ^f-7^1

SMART (7 y A T > b iiyv A % y b c£> Anonymous FTP^'^A#"®!) C.60v

LTliPACADE At

FACADE-C(±, #j£±SiattSrfiJffl#^

2ft6(±PACADEhM#C^ naE-e^^o

CtL6<7)cf y 9 — % V y

iI t) "Cab 6 o

[SCOP] http://scop.mrc-lmb.cam.ac.uk
[CATH] http://www.biochem.ucl.ac.uk/bsm/cath
[FSSP] http://www.embl-heidelberg.de/srs/srsc7-infollFSSP

-131- 3S ^(DMkbJjsWi im

1. 7 s 7 - ? [WS • B69]

67)4:^&o o&&d\

#aaizf(o#mya7 < a»6

(#) &[f?>/

[StE<7)T7 Yy'i'y]

1. FDD JS3 Molecular Indexing Efe & t\ i&l5:f-60|l?l7pn 7 j )V% >—'/ tf f ttz, K

(mRNA) <7>£KFS£?n-->?\ 'y~ 7 jzy y y 7% ti:, y-^xy^f-

2. ±EyXf hfSo

^i/f-^^-xtfciit^o 3. ±Ey'-??)mm&6 2:(:. mefM(7)m#$.7l'7-^^##KfS77l'm%e

4. cfi6(T)R#^mx.fi:byyA(:7^-K/<7^L"c. CJtSffl-f So

-132- [nMEeDSMUi] r#+@

%% y a ^ & <% a -c ^<6-c^'y->y^R(Dmsj±!R[:ZcT#m^%mya7 7/ v vt

■— yf t X 7° V 4 '/£/5slt 4> jitL X V' h o &* Jltf Ki±Xu-WL Sft£ig-Cj&» e>-C*v>t 4:

XTvy^yyAK3F!H##e#mf&C:k'eyXTA^&mmL, @# wftFDDy%TAe##f

B#fo &#ftyXTA-Cia

1 ) &'<> K©T>f T*>T 4 t4 i 7 i -7-

x fdd ##A#&

^2 &o 2(7)^#ftFDD y%TA(:Zc"C»eyv 7

^x'?j-4

C(Da 7^#%<7)^|n|%(ayXTA2:

####. ^###.

-133- J5Cb%<7)Bfc6.

##cDNAj3Zi7y7/f 7-^mnf &<%*

1) FDD KmL?zMm<7)ffl%&m!BiZ^'blzKm&0 #C, 1)

7 7-#(0K*&o 2) %ff(7)FDD y%TA-C(ie^7'7 A

<7)#mcMLT(±,

1) 5Bfrntf> b i), mm^#aR#<^@<7)A^cDNA^

< a%^m7;i/n-r-

#. >ye#62h$r#^B

iE^tti%(7DSE #1f(DFDD yXf A'Cli%mmDNAy-^7'4-±'C(7)%7yy^m^N(#$:#ML,

-@<7)wite£2.5b#miE-eirorv^o -m

-134- m/fV7 M7)iHSg ±%y%f K^m#%c#m f&7 7 b7jc7$-m#&f60 %ffFDD7^f A(:#f&M#(DV7 b^^#L"C^^

f ? yc^v'TAf-^ a &o

mmm ^y%TA(±A:yy %##<%##&e-5 tS^Wm't^o

^nyx? b<7>gs<7)ji^<7)^:*t-, 2:

,541 & a 1:4-## L < mmf & T&6 7 c^A-c # &#&&&## ^ a at ^,o

t#fe] 1) ^ZyXTAU, mfKf<7)%m7'n7/f

2) #x (#m)

###\ ft#^(03c^c j:^c$Mjg;< #m?& &=,

-135- 3) a L -c # ^ c 6 2 a .

2.

j: 2 k (±r# ^V'o

HkM&mzX&y; J*fem7,-?'<-z'£&zbK3: 0. <£

L-ni. • DNA ga^ij fr t> OiKS^SitO^ilJ

• dna be^ij j&' h (D^mn^m t

• DNAffi#l/7 5 J afcgitfijtf)*^ n v-^j—f- - 7<

infix'-? A V \ 7-? £n > M j--7±-C"> 5 JL U-v 3

rommRgj a^9^#6(o^B#LTv^<2^c»59o

-136- #5 $-¥'-j hnoBM mk IMSfcJ«F£fc«fcS®m&S

AN 1) e¥‘>XTAX^©SUtti €®s r v s ©^esw-c *5. c® E##j@© ^ft

acttk X -f a y 7

B, «k s S> aioa j: Lr®S6H®gfc$*B69i famamz#. a&g ® * * $nrv tzbk d^&smiFKAirc, AEmcme?

#©%mtek3Ef set ^wtii f£ d > Pixt-f 7)

£tl5o

2) m&Z&ffi-fayj-K iKtmmmomt

A. E*£ 1) 6t@SSI 7 Vf&Tst ')- — >¥ es® * >r

2) yvii-7 h9-f ©#m

3) E#9WM IEI«|o»»i Lr yy A^comgc j: ^gsnsfcoii®#® »aas^Kt>»tt-r«. 0J x. (^glucocorticoid %0 ti/tifeSSWJ, dexametha- sone ; ligand^-S-SE 3? BX® agonist), tamo- xifenCpcS^J ; ligand# itElzSFIS^F® antago ­ nist), fiStiSMtamin DS (#Et&Er6EH; ligand #Am&#B?© ago- nist)#@E#^*m^ ffttU MKBUJWWe mmtirm&mm'p® thiazolidinedione #fc E^FB^PPPAB© agonist

tv 50 £ FK506 t> signal tras- duction pathwayL xmim^rntzK

-137- m a) Lrffi E#tf

t LT^jjLSnfcfccDI? to. %4 LT&SSftfck© ©#m

6. 4) &&?%#!:# earns esi@i6©^E»^Bfl $ tituf, mm# ©B3S^©—Oilt, me?©%@%E,%# mKMfa##WM?# aciicttD, mm, mmmmfimm#BEi^E® - ©SSf ic 4: a , J: ?+e?tac4#Mi

Lfcweie^KMt* transgenic animal Set' li knockout animal ©

a. b. mm 1) n&irm -B©ES mmmwm&vwz. 2) mmw mm®mfy? 3) WftHUKUfcSfflJlfi? #m46^k«a mm C6d#j@ 4) cDNAWm m&ma#6?©#%EE%SK#t'%#®Eibf a v - * - znK'pmomm&v aci » #BM9BSA, me?4#Wr~

—138 — (« &) E. ‘fbJS&DD i) m&mynotmt«isi± EEm©ft*eKA*t%gKtV'/:A ©#m, § to %m#$msm©m-T %

*&**©«& sEVii ±m 7ifi®|6)±^0^rv5 f. mm 1) • wa *mBK©W, lEffl mm 2) 3) IS* ©#%©i#^#lc 7 y^7S©Axm

4) i»^w • tm&m m*©f#%©#mi:«k3 7 1-7-7 ©fiJfflKJ:5S £W•^E#©AXM#4

5) #%©am y 7> l Aiee^e ##j#7”□ 7'7 A ©$Htc cfc f). am$#@RS%uci4»%©$$?, ii^^*M6EK

G. *E i) frwmmimvf&L dn Amm^^msw, (mmmmm x7 v —— y) ’ ©x 7 V —— y 7* H. SS 1) w# Wat:W^%*ef©K@6W#©## It I . x y 7 h 1) SS-fey-t#- mm®*, mmmt * n x-7 x 2) ^4 4-*? mm t j. am 1) i 2) ^tlT’a-fexii^i LT • *%me©A%# iW30«r>S£E tzkxmuwKk'o^jutgmz&M-? n-tXCi§Ux5„ 3) • #ss • w^nti^a© am&[f#AK«ko. a#©&a*t* *u a 6«cm#amT?me?w*%

4) amt-,-##, $ilfcrp> y 45 — 7 — nrt'So A Ax# % >7*4 Jiymmzmmt zct K«kD\ t-7-@#^#iimf6c6 nJtltCtt 6o k . mm 1) DNAfiM8£H 3-^-f y7'##©#E T-7'<-XiliK7Cit (t h) f 7 y/

2) V 7 h 7 * 7 /< v x yt£ 3) DBt-t'x —7 *##DB©m#/DBm# 4) t'X -SI5MJ& 5) 7 -"K5 h V • 4" — h > — y a y 6) A»t-7 £>;!§»)!:: mm »jfitftt c A&#»# 5.LV *, gmmmit 7) /4Xt#trf-7©M *%#&©«* flr&E 8) me?##f-7^-xESTs, *7-f v-y -f DB 9) me?##y^y-7 *aw% 10) WS, Super am?yy#& 11) BIO CAD mum amm?Nt= 4:3^4 4-m#©%#t 12) Molecular Machine 74 f'-f 7© A SrEStf^Stitti 13) DNA Computing 74 tN 7©» mt**7-+^7f +©*%

-139- S5S tltxf

^X174 7 7-y^y7Am&Bd^J (5kb, 1978^) <0&#W#x A7 7-Vx EB7

4 &Xx SiHK +M b 7 Jfvnyj ;vXx /\t7 -f ##x *Mx ±#@x #A

<^y7Aamj^&#x m^^by am## s, #c $ & & (#x c ^ ^ ## &aac*m L###c%^T ^

2:#m#T&i)x ###me?)Bl#6C2(=&&o

i) y7A#%(?)#^2:mmx 2) y7Am##mum

—a) y 7 anmc & ti &tm% b)

3) yny^7b^

"j b <7)#Wb£r:fi:vv7’n V' x. 7 b^GOJIlf^frofco

A#%ct±x #% b '>-^ >xr-7-e& sssam#^t^

#%#x &V'%e#6(W;e#^:bfLto mt)x m#&#DNA(o#

9-7(D#$r-c&»)x %#?#!%## fX o v ■> X ft tz o

3%l:x ##mmx * fK?mKmm%C"3V'T&#:&Lx bco^-cm&#ir6^ tZo

c9L&&fK±ii#x m#z#x b(±%*

cm&#"?& 0v'o

#ATj3 b x t O-Lx 7nyi^ b»^|p]x X-y-7 b £#SL

TVX 9 o


1 . DN A v-f h7°'J> K£

7 v > h&li,

a#. 7vbyu>h& fflv\ # W/M11) *0 &M £ M $ii& *f£ X'hho fee*

^##6 -

DNA Wr li 7 V x 5 K (z ? u - > it $ ft tz Sp 1 *g£ §Mi £ ^ fr SV40 Is^SU ##M£# m&L-e, &9-^iikrtf->e^rf'6y7/f'7-^m^"epcR DNABfmi7Cy>#@24k$fitm$llf-X(:#'&L' Splh^ KB#, DNaseICZ^-e@B^%(:#»rL^o m^^-X(:^L^DNAKr#-(±%^#, LI-CORttModel 4000L DNAf-7J:>f-CZ ^m#*f$*"em^fWCZ l)^#L tzo SisMii, 2,000V, 30mA-e2.5 -3.5 RfKfrV', AD j: U=T- 7 16 If v ht-K(65,536l##)-ee-3^o W^tVtv hhy/f > j:o, Ltzo %^(7)%M#^e#(:Z6##gU#±300bp@K'e6c^

(Dirmz £i)fa i,ooobp (Dmtif&nm t%-? *(gi 1) = ttz, 5oobpMoDNA»r>t£ H^tzm&Ktt. ^ 2.5 Be -e 3fmol

m^-xt dna v -

#^m^fet%8o%^±m^i,ooobpmm[:^T(DW#iJ:v^ > t£#*&£#&&% -ejsi), v haLt!g^M##%?)y;%TAi8

-141- 2. 7 7 - v x < * y u -n= «t 5 □ - - > 7

a-z: >/ a%*<% o

One/Two-Hybrid System ^i±+^&ft|$rii)S:>5$S8f#X:' § & v Vc #x fr/i (-77 — 7t <

m $ # cDNA e m ^ -c 7 i y 7 v -e

& c a "C & & ^ %

^^v^#x.^ft&;77-yemv^T'^xyix/ryXTA (giz) &#BL"c. m%

GC-box(*4S®Efltr^L fcffltiE)$Sfc895 bp®iHli. DNASSKfflSa-5>®£®it-7'<-iULfc 37?m##en=7.t®?_ +$ j:V-li, ssssii.-eti-t-'h, spvDffi-ssmtSp19>/<(" %®eSTAj:U:## u -mins tire. GC*oxEiw-t,ll< B1 DNAy-^7t-IU5m'$#7?m>T^y B2 77{yfa7HU6m7^'j-#&g##7n-:>y

-142- 8

1. MBg9W#

PCRm%#mmK9J#44:#3L<&V'gmdi#A$%6ht'9, PCRKfp H jo it a Taq DNA polymerase 6DSS$ (fidelity) A(D 1 Otr&c^o bb.hi)K 1994¥3fiC, A phageDNA ^*Ii: Lfc^35kbp t ■ewJtfS^nrH’C

Long-PCR C&otzM&&&#JE tills 6### DNA polymerase $1 x. if Pfu DNA polymerase(Hyperthermophilic archaebacterium <£ t) ##5 jiZ-:Sr Lv^>if ####)"C^ %$ til: 0 ,Taq DNA polymerase =fc {) l(j 12 tBi® v' fidelity 7)5'f# BS £ KSM i-iP Z-fzbbl-bZ) 0SP h , Taq DNA polymerase C (± exonuclease ^ DNAm^-e-BK^cm^ tm&mtmm <7)a‘E:#5'E!F$ii&o'S"'l*C\#IES!fls£-t#'0 DNA polymerase £ ^'MiP x. h b mismatch 31### DNA^Wbna l 9l:^ao

>th- h(7)m$^m+kbs(7)DNA^*#j/-f

Long-pcRm(mm@g^j(7)#$%#$f#mhL'c%%-e6a

—143 — Long-PCRH t-d A

1.2 Amplification of Schizosaccharomyces pombe DNA

(1)

H -b > 9 — •C'/“^x>x^#L“CV'S Schizosaccharomyces pombe <7) n x 5 K7

n — >(P65 ordered clone)<7) culture fr b 7 )V i: 'J <£ t) mm$L 7 x J — )V 9 n u ^71/A mm, Ilt-fey^Ai: ±&£E4|&i£i;:£*ft«£fTofcDNA£fflv>*:0

(2)

Primers were designed at intervals out to about 40 kbp, with Tm values of 76-84 °C.

Primer                     Primer sequence
Sense S                    5' tcacgcaggataccaaggctgatgttgtagat 3'
Antisense
A1  (approx. 7 kbp)        5' tgcgcggaacccctatttgtttatttttctaa 3'
A2  (approx. 13 kbp)       5' ccttaagatggtatggaaatcgtttgcgttgt 3'
A3  (approx. 13 kbp)       5' aacctatatgcattaaatacggaggcgaagta 3'
A4  (approx. 18 kbp)       5' caaaatcgttgtcgcacaataagaccaatgga 3'
A5  (approx. 18 kbp)       5' aaagtaatttggactgacgacgcctctctgta 3'
A6  (approx. 23 kbp)       5' gctaacacctgagttatccgtttcgccagagt 3'
A7  (approx. 23 kbp)       5' tcaaattcgttgtacgcatctgtcattactta 3'
A8  (approx. 28 kbp)       5' accttttgggatgttacgggtcttcctaatca 3'
A9  (approx. 28 kbp)       5' atggctactctccccttgattttactttgctc 3'
A10 (approx. 33 kbp)       5' gatctcgaacaaaagcatccatttccgttaca 3'
A11 (approx. 33 kbp)       5' catgcaattcggcttttcgtttctccagttct 3'
A12 (approx. 38 kbp)       5' ttaagacctcagcaatgccccaaaacataaag 3'

(3) PCR conditions
PCR was carried out with the TAKARA LA PCR Kit under the following conditions.

Template (50 ng/µl)               1 µl
10x LA PCR Buffer                 5 µl
dNTP mixture (2.5 mM each)        5 µl
Sense primer (20 pmol/µl)         0.5 µl
Antisense primer (20 pmol/µl)     0.5 µl
TAKARA Ex Taq (5 U/µl)            0.5 µl
d-H2O                             37.5 µl
total                             50 µl

GeneAmp(TM) PCR System 9600

98 °C 10 sec, 68 °C 15 min: 30 cycles; then 4 °C

(4) Results
The PCR products were examined by agarose gel electrophoresis (Fig. 1). With Schizosaccharomyces pombe DNA as template, amplification was possible up to about 40 kbp.

electrophoresis: 150 V, 2 hr, 0.3% agarose H, buffer 1x TBE, 2 µl

lane a, i: Marker λ/Hind III · EcoR I double digest, 250 ng (23.1 kbp, 9.4 kbp, 6.5 kbp, 4.4 kbp)
lane o: Marker T4dC + T4dC/Bgl I digest, 250 ng (166.0 kbp, 40.8 kbp)
lane b: SxA1, approx. 7 kbp
lane c: SxA2, approx. 13 kbp
lane d: SxA3, approx. 13 kbp
lane e: SxA4, approx. 18 kbp
lane f: SxA5, approx. 18 kbp
lane g: SxA6, approx. 23 kbp
lane h: SxA7, approx. 23 kbp
lane j: SxA8, approx. 28 kbp
lane k: SxA9, approx. 28 kbp
lane l: SxA10, approx. 33 kbp
lane m: SxA11, approx. 33 kbp
lane n: SxA12, approx. 38 kbp
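The primer set listed in (2) can be sanity-checked with a few lines of Python. The sketch below reports the length, GC count and a rough melting temperature for each primer. The Tm formula used here (81.5 + 0.41 x %GC - 675/N) is an assumption chosen for illustration; the report does not state how its Tm values of 76-84 °C were calculated, so the numbers will not necessarily reproduce that range.

```python
# Primer sequences copied from the primer list in (2).
PRIMERS = {
    "S":   "tcacgcaggataccaaggctgatgttgtagat",
    "A1":  "tgcgcggaacccctatttgtttatttttctaa",
    "A2":  "ccttaagatggtatggaaatcgtttgcgttgt",
    "A3":  "aacctatatgcattaaatacggaggcgaagta",
    "A4":  "caaaatcgttgtcgcacaataagaccaatgga",
    "A5":  "aaagtaatttggactgacgacgcctctctgta",
    "A6":  "gctaacacctgagttatccgtttcgccagagt",
    "A7":  "tcaaattcgttgtacgcatctgtcattactta",
    "A8":  "accttttgggatgttacgggtcttcctaatca",
    "A9":  "atggctactctccccttgattttactttgctc",
    "A10": "gatctcgaacaaaagcatccatttccgttaca",
    "A11": "catgcaattcggcttttcgtttctccagttct",
    "A12": "ttaagacctcagcaatgccccaaaacataaag",
}


def gc_count(seq):
    """Number of G or C bases in a primer."""
    return sum(base in "gc" for base in seq.lower())


def approx_tm(seq):
    """Rough Tm in deg C (assumed formula, not the report's own method)."""
    n = len(seq)
    gc_percent = 100.0 * gc_count(seq) / n
    return 81.5 + 0.41 * gc_percent - 675.0 / n


def summarize(primers):
    """Map primer name -> (length, GC count, approximate Tm)."""
    return {
        name: (len(seq), gc_count(seq), round(approx_tm(seq), 1))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc, tm) in summarize(PRIMERS).items():
        print(f"{name:4s} length={length} GC={gc} Tm~{tm}")
```

All thirteen primers are 32-mers, which is consistent with the long, high-Tm primers normally paired with a combined annealing and extension step at 68 °C.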

1.3 Amplification of Pyrococcus OT3 DNA

(1)

4 >-9 ‘— 1 -^-fL-^tLll 10,15,20kbp<7>S$ Pyrococcus OT3

> (JEiCGAlx 4A9x 4A10) culture b X X 9 X X KSB^E^EPI- lOOES-fflV'-cfllltflLfcDNASrfflv^o

(2) BACffl<7)^.-/'?--if;i/7,7 4 v-§rTm^'62~63(C^^ j: -) XX 31S, T XX- XrXX 3@E£BStL£0 ttz, IXTX) ,5 (co i <> iiit L (DpberoBAC A (-fe x xg.^'7 x X--b x X EX'Jlf) (% cross-hybridize L

—146 — &v->0 (cross-hybridize 4"& Rl"&6Tm7 A v —60g|5^<7)Tm (wit "<"C20°C m±#fLTV'&o) (DpGEM X% TlT's^--1 >X@££iJ^) (C cross-hybridize L & v' (cross-hybridize 4" & nJ1614 X>

(DGenBank bacteria database rh 1X4—60SSE154 XXF6 L & v10 ©GenBank bacteria database 4160 cross-hybridize © & W1614£> & & SSEd^lJS£h

©7 X-7-k > X (4#%<7)Not I "9© 1 Xr&6 position29 6 positionlOO i "CdC-fSiEo ©■k >X (4##(DNot 1X4 1 position6781 b position6000 ^ fio

Xx 4 v-

(3) PCR conditions
PCR was carried out with the TAKARA LA PCR Kit.

Template (10^6 cells/µl)          1 µl
10x LA PCR Buffer                 5 µl
dNTP mixture (2.5 mM each)        5 µl
Sense primer (20 pmol/µl)         0.5 µl
Antisense primer (20 pmol/µl)     0.5 µl
TAKARA Ex Taq (5 U/µl)            0.5 µl
d-H2O                             37.5 µl
total                             50 µl

GeneAmp(TM) PCR System 9600

95 °C 4 min; then 94 °C 15 sec, 65 °C 12 min: 30 cycles; then 4 °C
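As a worked check of the 50 µl reaction composition and the cycling program listed above, the following Python sketch sums the component volumes, scales the shared components into a master mix, and estimates the hold time of the program. The split between shared and per-tube components and the `excess` pipetting allowance are illustrative assumptions, not part of the report; ramp times of the thermal cycler are ignored.

```python
# Per-reaction volumes in microlitres, as listed for the 50 ul reaction.
REACTION_UL = {
    "template": 1.0,
    "10x LA PCR Buffer": 5.0,
    "dNTP mixture (2.5 mM each)": 5.0,
    "sense primer (20 pmol/ul)": 0.5,
    "antisense primer (20 pmol/ul)": 0.5,
    "TAKARA Ex Taq (5 U/ul)": 0.5,
    "d-H2O": 37.5,
}

# Assumption: template and primers differ between tubes and are added
# individually; everything else goes into the master mix.
PER_TUBE = ("template", "sense primer (20 pmol/ul)",
            "antisense primer (20 pmol/ul)")


def total_volume(recipe):
    """Total volume of one reaction in microlitres."""
    return sum(recipe.values())


def master_mix(recipe, n_reactions, per_tube=PER_TUBE, excess=0.1):
    """Volumes of shared components for n reactions plus an assumed excess."""
    factor = n_reactions * (1.0 + excess)
    return {name: round(vol * factor, 2)
            for name, vol in recipe.items() if name not in per_tube}


def program_seconds(initial_s, step_seconds, cycles):
    """Hold time of the program: initial step plus the cycled steps."""
    return initial_s + cycles * sum(step_seconds)


if __name__ == "__main__":
    print("total per reaction:", total_volume(REACTION_UL), "ul")
    print("master mix for 9 reactions:", master_mix(REACTION_UL, 9))
    # 95 C 4 min, then 30 cycles of 94 C 15 sec and 65 C 12 min.
    seconds = program_seconds(240, (15, 720), 30)
    print("hold time:", seconds / 60, "min")
```

With the values above the components sum to the stated 50 µl, and the program holds for a little over six hours before the final 4 °C step.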

—148 — (4) a&m (1212.) L, PCR±##(D;<>K^ SlxA2%(7S2xA2^m*^t,-^^

m2.7°y 4 '?--&coMM

electrophoresis: 150 V, 1 hr 50 min, 0.3% agarose H, buffer 1x TBE, 5 µl
Template: BAC 6A1 (approx. 10 kbp)
lane a: Marker λ/Hind III digest, 250 ng (23.1 kbp, 9.4 kbp, 6.5 kbp, 4.4 kbp)
lane b: d-H2O
lane c: control
lane d: S1xA1
lane e: S1xA2
lane f: S1xA3
lane g: S2xA1
lane h: S2xA2
lane i: S2xA3
lane j: S3xA1
lane k: S3xA2
lane l: S3xA3

BAG clones CO insert RV1'/ y 4 v — 9 > Template t Lf

—149 — 6A1(# lOkbp), 4A9(l>jl5kbpX 4A10(^20kbp)^y iV9 4 v- SlxA2, S2xA2 <7)E& ^t,-treLong-PCR$-^v\ ffif%#PCR#%& 7 #n-xy (103.) L #m:#LtPyrococcusOT3<7)/f>'f-|'##m

m 3.pcr

electrophoresis: 150 V, 1 hr 50 min, 0.3% agarose H, buffer 1x TBE, 5 µl
lane a, j: Marker λ/Hind III digest, 250 ng (23.1 kbp, 9.4 kbp, 6.5 kbp, 4.4 kbp)
lane b: d-H2O
lane c: control
lane d: 6A1 (approx. 10 kbp), S1xA2
lane e: 6A1 (approx. 10 kbp), S2xA2
lane f: 4A9 (approx. 15 kbp), S1xA2
lane g: 4A9 (approx. 15 kbp), S2xA2
lane h: 4A10 (approx. 20 kbp), S1xA2
lane i: 4A10 (approx. 20 kbp), S2xA2

-150- 1.4 ##

> Zfrmz'f? 4 -r-frft&ir&ztizl: b Elf %±gl@Lfcd $, £j 40kbp t Tl# Is ds M E -C&o/io CLfUi, nx 5 KcolfXE)tS (l-j40kbp) nx 5 -tfx"? u-->x"L&v'-eiM$g-e§& £ k£/SitL-ci5b, cz b 40kbp^T^)DNAem#y-^J:>%i-6caME'e&&a#x.^fL&o Ml:, DNA it: X blfAEfr^ltSLAdC lb20kbp^-e±i^^BlE'efeo^o Ld'L, JfAEJtt:

-fv 4 -x-d'T--V ^54 v - Txd*$^I(0#'^(:6i@M'C#62id^6d'i^c)d:o -^<0^86, mA6 2id##C^6i#x^ft6.

2. mttm&Ykmmiwmntz*>

2.1 mmSg^!l f - ? d' & <0###^ <0#^^ if 0 C ti#%r & ###& 6 d*, M 8 5tMW-^(Schizosaccharomyces pombe ) X £ if v\ ORFWL

2.2 im&@d?ij-r--? Yu / TYHWfr b left arm#JCfAilA 6 3 x < KX n — >i:oo fit: Shotgun Sequence (:<£ o AtMijSE^J^t&JE^if o tza ^j3, x-^>i-(iABI(0373A&i^377e#mL, l^x< F^d:b 800-900y-^ xxf-f&fHv'TT 7t^y&irifcd:o

2.3 The nucleotide sequence data obtained were subjected to homology searches with BlastX and Smith-Waterman, and to gene finding with INTRON.PLOT, software developed by Dr. Michael Zhang of Cold Spring Harbor.
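For readers unfamiliar with the Smith-Waterman search mentioned above, the following minimal Python sketch computes the best local alignment score between two sequences by dynamic programming. It is an illustration only: the match, mismatch and gap values are arbitrary assumptions with a simple linear gap penalty, and it is not the implementation or the scoring scheme behind the results reported in this section.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b.

    Scores are illustrative assumptions: +2 match, -1 mismatch,
    -2 per gap position (linear gap penalty).
    """
    prev = [0] * (len(b) + 1)   # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]              # column 0 is always 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap          # gap in b
            left = curr[j - 1] + gap    # gap in a
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Clamping each cell at zero is what makes the alignment local: a poorly matching region cannot drag down the score of a well matching segment elsewhere in the sequence.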

—151 — 2.4 ##&###

(1)

The sequence determined was 38,007 bp in length. The error rate is estimated to be 0.02% or less. The redundancy is 9.
(2)

ti^i'o 'c<7)tpl~ mannosyltransferase, cuitinase^ cell division control protein 2 t (0^*c tzo zfih

LTv-f >X&fTct;K HxSo

h o > LC < v^^##&powbe -C6 ^ T t # ^-e#tmmL, i±Sffl-e§ Si-T##

Appendix Table 1. Results of INTRON.PLOT
(Frame = reading frame; Score = high score; P(N) = smallest Poisson probability)

ID    DB     LOCUS     ACCESSION  Frame  Score  P(N)       N  Length  High-scoring Segment Pairs
ip01  pir    S08118    S08118            415    5.10E-49   1  141     histone H2A.vD - fruit fly (Drosophila melanogaster)
ip02  pir    A23359    TVZP2             1549   2.70E-206  1  297     cell division control protein 2 - fission yeast (Schizosaccharomyces pombe)
ip03  pir    JQ1696    JQ1696            75     4.30E-14   4  426     pistil extensin-like protein precursor (clone pMG15) - common tobacco
ip04  pir    S19510    S19510            469    1.50E-56   1  391     hypothetical protein YCR094w - yeast (Saccharomyces cerevisiae)
ip05  pir    S38170    S38170            52     0.098      3  406     SRP40 protein - yeast (Saccharomyces cerevisiae)
ip06  pir    A39205    A39205            74     1.50E-09   3  414     nuclear localization sequence-binding protein NSR1 - yeast (Saccharomyces cerevisiae)
ip07  pir    S12797    S12797            206    1.30E-19   1  286     ribosomal protein S15 precursor, mitochondrial - yeast (Saccharomyces cerevisiae)
ip08  pir    S05808    BWBYDL            53     0.37       4  1312    RAD50 protein - yeast (Saccharomyces cerevisiae)
ip09  pir    S41584    S41584            1985   3.60E-267  1  384     his3 protein - fission yeast (Schizosaccharomyces pombe)
ip10  pir    S31479    S31479            88     0.0027     1  806     sucrose synthase (EC 2.4.1.13) - fava bean
ip11  pir    JQ0703    JQ0703            84     1.50E-08   2  609     UDPglucose-starch glucosyltransferase (EC 2.4.1.11) - rice
ip12  pir    A26836    A26836            1957   4.40E-262  1  375     actin - fission yeast (Schizosaccharomyces pombe)
ip13  pir    S36075    S36075            241    1.10E-23   1  111     fkh-6 protein - mouse (fragment)
ip14  pir    A38197    A38197            274    7.80E-58   3  418     cholinesterase-related cell division control protein CHED - human
ip15  pir    B39654    B39654            61     0.0005     2  341     cell cycle arrest protein BUB3 - yeast (Saccharomyces cerevisiae)
ip16  pir    S49795    S49795            321    3.00E-56   3  517     alpha-1,2-mannosyltransferase homolog - yeast (Saccharomyces cerevisiae)
ip17  GPDAT  YSPGPH_1  L28061     +1     1607   1.70E-217  1  317     Schizosaccharomyces pombe; Eukaryotae; mitochondrial eukaryotes; Schizosaccharomyces pombe gene, complete cds. 8/94

Appendix Table 2. Results of Blastx
(Frame = reading frame; Score = high score; P(N) = smallest Poisson probability; the description of the sequence producing the high-scoring segment pairs follows each entry)

ID      DB     LOCUS         ACCESSION  Frame  Score  P(N)       N  Length
blax01  GPDAT  S74633_1      S74633     +2     873    5.90E-116  1  171
        Schizosaccharomyces pombe; Unclassified. pht1=histone H2A variant [Schizosaccharomyces pombe=fission yeast, Genomic, 1400 nt]. Method: conceptual translation supplied by author.
blax02  GPDAT  YSPCDC2_1     M12912     -2     547    5.40E-162  3  297
        Schizosaccharomyces pombe; Eukaryota; Fungi; Ascomycota. Yeast (S.pombe) cell division gene (CDC2), complete cds. CDC2 protein kinase; NCBI gi: 173359. 5/87
blax03  GPDAT  SCCHRIII_200  X59720     +1     469    2.10E-84   3  391
        Saccharomyces cerevisiae; Eukaryotae; mitochondrial eukaryotes. S.cerevisiae chromosome III complete DNA sequence. YCR094w, len: 391; pid: g5483; NCBI gi: 5483. 6/95
blax04  GPDAT  YSPHIS3A_1    L19523     +3     1061   1.20E-158  3  384
        Schizosaccharomyces pombe; Eukaryota; Fungi; Eumycota; Ascomycotina. Schizosaccharomyces pombe imidazoleglycerol-phosphate dehydratase (his3) gene sequence.
blax05  GPDAT  SCDNAALG2_1   X87947     -2     219    1.80E-78   6  529
        Saccharomyces cerevisiae; Eukaryotae; mitochondrial eukaryotes. S.cerevisiae ALG2 gene. Glycosyltransferase; pid: e184159; SGD: L0002798; NCBI gi: 871531. 6/95
blax06  GPDAT  SPAC23D3_15   Z64354     +1     544    1.30E-107  3  1204
        Schizosaccharomyces pombe; Eukaryotae; mitochondrial eukaryotes. S.pombe chromosome I cosmid c23D3. Unknown; partial orf, len: >1204, most similar to SW AMY_STRU Q05884 alpha-amylase precursor (24.1% identity in 291 aa overlap).
blax07  GPDAT  SPACT1_1      Y00447     +1     1169   2.80E-157  1  375
        Schizosaccharomyces pombe; Eukaryotae; mitochondrial eukaryotes. Schizosaccharomyces pombe act1 gene for actin. Actin (AA 1-375); NCBI gi: 4900. 9/93
blax08  GPDAT  RATHFH2_1     L13202     -2     237    6.70E-25   1  101
        Rattus norvegicus; Eukaryota; Animalia; Chordata; Vertebrata. Rattus norvegicus HNF-3/fork-head homolog-2 (HFH-2) mRNA, complete cds. NCBI gi: 310155. 6/93
blax09  GPDAT  SCCHRIII_161  X59720     -1     218    5.30E-21   1  532
        Saccharomyces cerevisiae; Eukaryotae; mitochondrial eukaryotes. S.cerevisiae chromosome III complete DNA sequence. ORF YCR065w homology to Drosophila fkh homeotic gene, JOS 1
blax10  GPDAT  CEB0285_2     Z34533     -2     213    2.00E-78   5  372
        Caenorhabditis elegans; Eukaryotae; mitochondrial eukaryotes. Caenorhabditis elegans cosmid B0285. B0285.1; similar to CDC2-like protein kinase; pid: e; NCBI gi: 1066453. 11/95
blax11  GPDAT  SC9910_10     Z46728     +1     321    2.40E-75   3  517
        Saccharomyces cerevisiae; Eukaryotae; mitochondrial eukaryotes. S.cerevisiae chromosome IX cosmid 9910. YI9910.11c, orf similar to KTR1, putative mannosyltransferase, len: 517, CAL Q;18.

Appendix Table 3. Results of Smith-Waterman

ID    DB     LOCUS      Strd  ZScore  Orig   Length  ACCESSION  Documentation
sw1   GB_PL  S74633     +     229.7   1000   1400    S74633     pht1=histone H2A variant [Schizosaccharomyces pombe=fission yeast, Genomic, 1400 nt].
sw2   GB_PL  YSPCDC2    -     239.7   1000   1687    M12912     Yeast (S.pombe) cell division gene (CDC2), complete cds. 5/87
sw3   GB_PL  SCYNR048W  +     37.97   176.6  1818    Z71663     S.cerevisiae chromosome XIV reading frame ORF YNR048W. 5/96
sw4   GB_PL  YSPHIS3B   +     258     1000   2196    L19524     Schizosaccharomyces pombe imidazoleglycerol-phosphate dehydratase (his3) gene, complete cds.
sw5   GB_PL  SPAC23D3   +     69.24   245.5  42037   Z64354     S.pombe chromosome I cosmid c23D3. 10/95
sw6   GB_PL  SPACT1     +     154.1   979    1528    Y00447     Schizosaccharomyces pombe act1 gene for actin. 9/93
sw7   GB_PL  YSPSENSD   +     37.29   143.7  215     L09641     Schizosaccharomyces pombe endonuclease hypersensitive site related DNA sequence.
sw8   GB_PL  YSCSGV1    -     37.22   155.2  2894    D90317     S.cerevisiae SGV1 gene for SGV1 kinase. 6/91
sw9   GB_PL  SCCHRIX_2  +     33.38   154.4  110000  Z47047     Continuation of SCCHRIX from base 200001 (Z47047 S. cerevisiae chromosome IX complete sequence. 4/95)
sw10  GB_PL  YSPGPH     +     226.2   984.2  2583    L28061     Schizosaccharomyces pombe gene complete cds. 8/94

[Figure: S.pombe gene map of the cdc2 cosmid, scale 0 kb to 38 kb. Tracks: blastx (pht1, CDC2, YCR094W, his3, ALG2, act1, HFH-2, KTR1, YCR065W); INTRON.PLOT (pht1, CDC2, his3, unknown, act1, fkh-6, YAH3, YII5); Smith-Waterman (pht1, CDC2, YNR048W, his3, act1, SGV1). Legend: S.pombe gene; S.cerevisiae gene homologue; other species homologue; low homology; intron.]
S.pombe gene S.cereviciae gene I = intron = other species ' =low homology homologue homologue $rjL^)]/3?— • El

% # 03-3987-9355 FAX 03-3981-1536