Toward Pathway Engineering: A New Database of Genetic and Molecular Pathways

Minoru Kanehisa

Institute for Chemical Research, Kyoto University

From Genome Sequences to Functions

The Human Genome Project was initiated in the late 1980s as a natural consequence of the technology developments in molecular biology and with the expectation of new biomedical applications. The project will uncover the complete DNA sequence of the human genome consisting of 3 billion base pairs and 100 thousand genes. In 1977, a small virus genome, φX174, consisting of just 5,000 base pairs and 11 genes was determined by the emerging technology of DNA sequence determination. After two decades of technology developments the first complete genome of a free-living biological organism, Haemophilus influenzae, was determined in 1995. The bacterial genome consisting of 1.8 million nucleotides and 1,700 genes is already followed by the explosion of complete genomic sequences of a number of organisms from bacteria to eukaryotes. As of mid 1996, the genome sequencing projects have been completed for budding yeast (12 Mbp) and for several bacteria including cyanobacteria (3.6 Mbp) which was carried out by the Kazusa DNA Research Institute in Japan.

For the first time in human history we are beginning to have the data at hand which leads toward a basic understanding of the fundamental problems in life sciences including the origin and evolution of life and the conception and development of an individual. The data will also stimulate practical applications in medical, pharmaceutical, and agricultural sciences. However, somewhat contrary to public notion, the sequence data obtained by genome projects do not by themselves provide direct answers to such fundamental problems or practical applications. The sequencing of a genome is an easier part than the understanding of functional implications of when, where, and how genes and molecules function in living organisms. Fortunately, our knowledge of the functioning of genes and molecules is also rapidly expanding owing to the advancement of experimental technologies in the broad areas of molecular and cellular biology. In order to make full use of the information obtained by genome projects, it is essential that such functional data are properly computerized in databases and informatics technologies are developed for functional prediction.

From Gene Catalogs to Pathways

The functional data that relate to sequence information are currently stored as annotations to sequence data, for example, in the so-called features tables, in the sequence databases of DNAs and proteins. However, these basically represent the sequence-function relationships of single molecules, i.e., the individual components of a biological system, and they do not contain higher level information, i.e., wiring diagrams, of genetic interactions and molecular interactions. It is obvious that without such wiring-diagrams a biological system could never be described or understood.

We have thus initiated a project named KEGG, Kyoto Encyclopedia of Genes and Genomes, to computerize the current knowledge of molecular and cellular biology in terms of the information pathways that consist of interacting genes or molecules. The basic data item in KEGG is a pairwise interaction of genes or molecules that is represented by what is termed a binary relation. An expert in the field would synthesize a pathway from a collection of binary relations obtained by experimental observations. In order to cope with the rapidly expanding body of information it is necessary to computerize the process of synthesizing pathways, in addition to computerizing known pathways derived by human experts.

Currently, we are focusing our attention on the metabolic pathways, although we intend to computerize other pathways, such as signal transduction and cell cycle pathways and the genetic pathways of the early stages in fruit fly development. We expect that once such data are properly computerized, it will become feasible to assist experiments, facilitate understanding, and even perform logical simulations of information pathways controlling all aspects of living organisms.
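To make the idea of synthesizing a pathway from binary relations concrete, here is a minimal sketch in Python. It is not the KEGG implementation; it only assumes that each binary relation can be written as a (substrate, product) pair and that a pathway is a chain of such pairs. The function name `synthesize_pathway` and the small `RELATIONS` list (a simplified fragment of glucose metabolism) are hypothetical and chosen purely for illustration. A breadth-first search is used as one simple way to chain relations.

```python
from collections import deque


def synthesize_pathway(relations, start, goal):
    """Chain binary relations into one shortest pathway from start to goal.

    relations: iterable of (substrate, product) pairs (an assumed encoding).
    Returns the list of compounds along the pathway, or None if no chain exists.
    """
    # Index the relations by their left-hand side for fast lookup.
    successors = {}
    for substrate, product in relations:
        successors.setdefault(substrate, []).append(product)

    # Breadth-first search, remembering how each compound was reached.
    previous = {start: None}
    queue = deque([start])
    while queue:
        compound = queue.popleft()
        if compound == goal:
            # Walk back through the predecessors to recover the chain.
            path = []
            while compound is not None:
                path.append(compound)
                compound = previous[compound]
            return path[::-1]
        for product in successors.get(compound, []):
            if product not in previous:
                previous[product] = compound
                queue.append(product)
    return None


# Hypothetical, simplified relations used only for illustration.
RELATIONS = [
    ("glucose", "glucose 6-phosphate"),
    ("glucose 6-phosphate", "fructose 6-phosphate"),
    ("fructose 6-phosphate", "fructose 1,6-bisphosphate"),
    ("glucose 6-phosphate", "6-phosphogluconolactone"),
]

if __name__ == "__main__":
    print(synthesize_pathway(RELATIONS, "glucose", "fructose 1,6-bisphosphate"))
```

With the relations above, the search chains three relations into a four-compound pathway from glucose to fructose 1,6-bisphosphate, and returns None when asked for a route that the stored relations do not support.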

Figure 1: The home page of the GenomeNet WWW server at http://www.genome.ad.jp/ and the DBGET and KEGG search windows.

GenomeNet Database Service

In 1991 we established a computer network named GenomeNet under the Genome Informatics Project, a part of the Human Genome Program of the Ministry of Education, Science, Sports and Culture (Monbusho). The aim of GenomeNet is not simply a network connection; it is to establish the informatics infrastructure for genome research and related research areas in molecular and cellular biology. In view of the subsequent government funding of Internet activities in Japan, we are only currently maintaining the connection between Tokyo, Kyoto and Fukuoka. Originally, we envisioned a network community where the informatics needs of individual researchers and individual projects would be realized on their local machines by integrating databases and computational resources distributed over the network. We believe that the wide-ranging database service in GenomeNet, which is jointly provided by the Supercomputer Laboratory of the Institute for Chemical Research, Kyoto University and the Human Genome Center of Institute of Medical Science, the University of Tokyo, has greatly contributed toward that end.

The most popular mode of access to the GenomeNet database service is to use our WWW server shown in Fig. 1, which provides among others, the DBGET integrated database retrieval system and the sequence interpretation tools including sequence similarity and motif search programs. The server receives tens of thousands of queries per day, one-third of which are from abroad. Although the GenomeNet database service is a result of technological developments in Japan, for example, DBGET was developed in my laboratory, most of the databases that we offer originated in the U.S. or Europe. Even the databases which claimed to be organized in Japan actually heavily depend on the systems and protocols developed in other countries.

KEGG is an attempt to advance our original concepts and technologies, and actual data collection efforts. Although we have not yet made a formal announcement of KEGG, a preliminary version has been publicly available through the GenomeNet WWW server since December 1995. The target date of the first release of KEGG is October 1996. We plan to distribute compact discs in addition to the service over the Internet.

[Figure 2: diagram. KEGG (pathways, genes, molecules; binary relations and hierarchies) connected through DBGET to existing databases, with labels including LIGAND, Genome, OMIM, CAS, DNA databases, and Medline.]

Figure 2: The concept of KEGG and its relation to DBGET. In KEGG functional aspects of genes and molecules are represented by binary relations, hierarchies, and pathways.

The Concepts

Currently, KEGG is composed of three interconnected sections: pathways, genes, and molecules, which are also linked to a number of existing databases through DBGET (Fig. 2).

Both conceptually and practically, KEGG and DBGET are tightly coupled systems. DBGET provides an integrated view of various databases in molecular biology, where the basis of integration is the link (binary relation) between related entries in different databases. In KEGG, an organism may be considered a database of genes and products, and the link between them is used for synthesizing a pathway. Thus, both KEGG and DBGET contain an aspect of the deductive database where new relations can be deduced from relations stored in the database.

Another important concept in KEGG is the hierarchy that represents functional, structural, and evolutionary relationships of genes and molecules. For example, the degree of similarity in sequences and 3D structures of proteins is used for classifying superfamilies and 3D folds. The taxonomy is the classification of organisms, which is important in extending sequence and 3D structural similarities to functional similarities. These and other classifications are taken from appropriate sources and implemented in KEGG.

While the binary relation represents flat, horizontal relationships, the hierarchy represents vertical relationships. Both are naturally integrated in the process of deduction. The concept of relation and deduction is thus the basis of our KEGG project. For our logic-based activities, we acknowledge the past collaborations with the Fifth Generation Computer Project team members in ICOT and the researchers in the Genome Informatics Research Projects 1991-1995 and 1996-2000.

The Technologies

KEGG makes full use of the advancements in the database and networking technology, including deductive and object-oriented databases, the multimedia environment of WWW, and the mobile agent, Java. Although we maintain the logic-based formalism of relation and deduction, we take a practical, flexible approach in the actual implementation. For example, we use the CORAL deductive database system for experimenting with the deduction process, but in the actual implementation of KEGG we have developed our own C++ library for manipulating binary relations and hierarchies.

With a similar philosophy, DBGET does not depend on any database management system. The entire system has been developed in house. Actually, DBGET has its roots in the IDEAS sequence analysis package that I developed in the early 1980s at the U.S. National Institutes of Health. DBGET aims at integrating different databases and different types of data in molecular biology. The integration, however, is at the level of data entries; for example, entries in different databases can be retrieved uniformly and links are made between related entries in different databases. In this loosely-coupled integration, the schema or the format of how an entry is organized by data items is left to each database. This should be contrasted with using the same relational database and enforcing a unified schema for entries coming from many different sources, which necessarily involves the process of data conversion.

The proliferation of WWW was a boon to our approach of loose integration; the link capabilities of DBGET fit the mechanism of WWW very nicely. The text-based DBGET system was easily extended to the multimedia environment, where 3D graphics, 2D graphics, and images are now retrievable in the WWW version of DBGET. In addition, because the update procedure does not involve data conversion, DBGET has been and will continue to be able to cope with the ever increasing number and volume of daily updated databases.

KEGG inherits all these DBGET capabilities. Furthermore, the graphics handling of pathway diagrams and chromosome maps has been implemented in Java. The new capabilities of logical inference and simulation are still under development, but we hope to make the first test version available shortly.

Data Collection Efforts

However efficiently the system is developed as a database, the most critical thing is the quality of data that it contains. Especially, since the data we handle require biological knowledge in specific domains, quality control is far more difficult than, for example, DNA and protein sequence databases. Among the many different subjects of molecular and genetic pathways, the metabolic pathways are probably the easiest to computerize because of the well-established knowledge and existing compilations. We have been computerizing metabolic pathways, in collaboration with an expert in biochemistry, mostly from the Boehringer wall chart and the compilation by the Japanese Biochemical Society, and partly from other textbooks and on-line databases.

KEGG currently contains most of the known metabolic pathways represented by about 80 graphical diagrams. An enzyme is a clickable object in the diagram to retrieve the corresponding entry of the LIGAND database and then, through DBGET, a number of related entries in different databases. LIGAND is a database of enzyme reactions and metabolic compounds that we organize in a separate project. It provides links between the new PATHWAY database and the existing databases of nucleotide sequences, amino acid sequences, 3D structures, sequence motifs, amino acid mutations, genetic maps, genetic diseases, and literature.

In addition to computerizing known pathways, we are developing methods to compute pathways from binary relations. The metabolic pathway is best suited for this purpose as well, for there is a chemical basis of the binary relation between a substrate and a product. KEGG contains the substrate-product relationships and the relationships of two consecutive enzymes that appear in the known metabolic pathways.

One of the major objectives of the KEGG project is to link the structural data (gene catalogs) obtained by genome projects and the functional data obtained in specialized fields of molecular and cellular biology. Once the genome sequencing is complete, it is customary to attempt to classify all genes according to their functions, for example in the scheme developed by Monica Riley. We plan to make a more objective classification based on the pathway data being entered. At the moment, we only perform the classification of genes.

Future Directions

Perhaps the most challenging task of the KEGG system is the inference capabilities that will help human beings to make logical reasoning processes. These capabilities have not yet been developed or implemented, but here are some examples (Fig. 3).

Given a list of enzymes (EC numbers) that are found in the gene catalog of an organism, KEGG automatically generates the organism specific pathways by marking the enzymes found. Then, the connectivity and completeness of marked enzymes can be used to assess the correctness of functional assignments in the gene catalog. The existence of a missing element implies that either the gene catalog is wrong or there is an unknown reaction pathway that utilizes different [...] in the catalog. For the latter possibility, the deduction from binary relations of substrates and products is useful. In any case, the pathway information is critical in the finding and functional assignment of genes in the genome projects.

As in the sequence alignment and 3D structure alignment, the pathway alignment will become an important tool to identify global and local similarities between two pathways or a consensus among many pathways. For example, the comparison of organism-specific pathways will identify functional similarities and differences, as well as evolutionary relationships, between organisms. Because pathway data are linked to a diverse range of data in KEGG and DBGET, they can be analyzed in many different perspectives. For example, examining where the enzymes in the same operon appear on the pathway will give insights into the regulation of gene expression, as well as the evolutionary implications of gene structures.

The pathway computations described above will have direct practical applications, which may be collectively called pathway engineering. For example, from pathway comparisons and analyses, an effective pesticide or a side-effect free [...] may better be designed. Genetic engineering was based on the DNA sequence information, and protein engineering was based on the protein 3D structures. With the availability of new types of data on the wiring-diagrams of living systems, pathway engineering is bound to emerge as a new biotechnology in the 21st century.
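The enzyme-marking example from this section (marking the enzymes of a known pathway that appear in an organism's gene catalog, then looking at what is missing) can be sketched as follows. This is only an illustration under stated assumptions, not how KEGG does it: a reference pathway is assumed to be an ordered list of EC numbers, the gene catalog is assumed to reduce to a set of EC numbers, and completeness is taken to be the simple fraction of pathway enzymes found. The function name `mark_pathway` and the example catalog are hypothetical.

```python
def mark_pathway(reference_pathway, catalog_ec_numbers):
    """Mark which enzymes of a reference pathway occur in a gene catalog.

    reference_pathway: ordered list of EC numbers (assumed representation).
    catalog_ec_numbers: EC numbers assigned to genes in the organism's catalog.
    Returns (marked, missing, completeness).
    """
    found = set(catalog_ec_numbers)
    # Pair every enzyme of the pathway with a found / not-found flag.
    marked = [(ec, ec in found) for ec in reference_pathway]
    # Missing elements are candidates for catalog errors or unknown routes.
    missing = [ec for ec, present in marked if not present]
    completeness = (len(marked) - len(missing)) / len(marked) if marked else 0.0
    return marked, missing, completeness


# First four enzymatic steps of glycolysis, written as EC numbers.
UPPER_GLYCOLYSIS = ["2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"]

# Hypothetical gene catalog of an imaginary organism (illustration only).
EXAMPLE_CATALOG = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}

if __name__ == "__main__":
    marked, missing, completeness = mark_pathway(UPPER_GLYCOLYSIS, EXAMPLE_CATALOG)
    print(marked)
    print(missing, completeness)
```

In this made-up catalog one step (EC 2.7.1.11) is not found, so the pathway is three-quarters complete. In the terms used above, such a gap would prompt either a second look at the functional assignments or a search for an alternative route deduced from substrate-product relations.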

[Figure 3: diagram. Labels: genome projects; gene catalogs; biological knowledge; binary relations and hierarchies; pathways; followed by the computations listed here.]

Path computation: deduction from binary relations; gene finding and prediction.

Pathway comparison: similarities/differences with respect to [...]/environment.

Pathway analysis: duplication of genes; relationships with operons; etc.

Pathway engineering: design of new pathways and compounds.

Figure 3: Pathway engineering will become feasible once data and knowledge are properly computerized and new computational methods are developed.
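As a rough illustration of the pathway comparison step listed in Fig. 3, the sketch below compares the enzyme sets that two organisms contribute to the same pathway. It is far cruder than the pathway alignment envisioned in the text: it ignores connectivity and order, and the Jaccard index used as a similarity score is an assumption made here, not a measure taken from the article. The function name `compare_pathways` and both enzyme lists are hypothetical.

```python
def compare_pathways(ecs_a, ecs_b):
    """Compare the enzyme (EC number) sets of two organism-specific pathways.

    Returns the shared enzymes, the enzymes unique to each organism, and a
    Jaccard similarity (shared / union); the choice of score is an assumption.
    """
    a, b = set(ecs_a), set(ecs_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Two empty pathways are treated as identical.
        "similarity": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical enzyme lists for two imaginary organisms (illustration only).
ORGANISM_A = ["2.7.1.1", "5.3.1.9", "2.7.1.11"]
ORGANISM_B = ["2.7.1.2", "5.3.1.9", "2.7.1.11"]

if __name__ == "__main__":
    print(compare_pathways(ORGANISM_A, ORGANISM_B))
```

Here the two organisms share two enzymes and differ in the first step (EC 2.7.1.1 versus EC 2.7.1.2), giving a similarity of 0.5. A real comparison would also have to account for how the reactions connect.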