In Silico Genomes

RESOURCES: BOOKS

Tim Littlejohn

Post-genome Informatics
Minoru Kanehisa
2000 Oxford University Press, 148 pages, $70 hardcover

These are heady days for geneticists. The first leg of the human genome race has finished, and a “draft” sequence of the human genome is now out at a database near you. However, in this data-rich world we are surprisingly information-poor, as the ability to generate molecular genetic data is largely outstripping our ability to analyze it, computationally or experimentally.

Minoru Kanehisa’s Post-genome Informatics is well timed to help biologists identify the grand challenges facing them in the post-genome era. And these challenges are indeed grand: no less than that of whole-organismal reconstruction and the prediction of a protein’s three-dimensional structure from its amino acid sequence, according to the author. Kanehisa laments that the physicists and chemists have been able to predict matter and compounds from elementary particles and elements for what seems to be the longest time, yet biologists still “do not yet know whether the information in the genome is sufficient to build an entire living system.” The challenge to genomics now is to turn to functional tools, most importantly bioinformatics (computational biology), to unearth the fundamental truths of living systems.

Post-genome Informatics is not just a guide to the post-genome perplexed, and certainly not a book for biologists looking for a “how-to” guide to bioinformatics. Indeed, the author deliberately steers well away from detailed discussion of bioinformatics databases and tools. Rather, the book takes a more theoretical approach, and in doing so, lays a clear and concise groundwork in both molecular genetics and informatics that underpin bioinformatics. It will be an asset to bioinformatics researchers, students, and educators alike.

Moving through the blueprint of life (aimed at computer scientists) and shifting rapidly into an analysis of database technologies (aimed at biologists), Kanehisa describes molecular biology databases that are both well established (sequence databases such as GenBank and SWISS-PROT) and emerging (biochemical pathway databases such as KEGG). This provides an excellent primer on the basics of molecular biology data and how it is currently organized, so the data can be converted into information using bioinformatics.

Readers should expect to be sadly disappointed if they hope to get a comprehensive review of the algorithms and tools for sequence analysis of nucleic acids and proteins from the book’s single chapter on this subject. Although the problems in sequence analysis are elegantly explained, as are the reasons why certain computer science and mathematical approaches are more suited to this discipline than others, the chapter is meant more as an overview than as a detailed analysis. The take-home message is clear, though: “computational biology ... usually involves processing empirical knowledge acquired from observed data rather than solving first-principle equations, which are virtually nonexistent in biology.” Biologists should have a copy of their favorite mathematics textbook on hand for this section, since a good grounding in math is needed to understand many of the basic principles in similarity searching and prediction of structures and functions. The material on similarity searching covers topics such as dynamic programming; global, local, and multiple alignments; FASTA, BLAST, and genetic algorithms; and phylogenetic analysis and simulated annealing. The section on prediction of structures and functions includes material on prediction of RNA and protein secondary structures; prediction of protein transmembrane segments and three-dimensional structures; neural networks; hidden Markov models; formal grammars; gene finding; functional predictions; and expert systems for protein sorting prediction.

Kanehisa ends as strongly as he begins with the final chapter on network analysis of molecular interactions. Molecular networks are described as being composed of various configurations of binary relations (e.g., molecular interactions) between elements (e.g., molecules and genes) in the cell. Molecular biology information is neatly dissected into various networks, such as pathways (e.g., metabolic), assemblies (e.g., multiprotein complexes), linear (e.g., genomes), neighbors (e.g., those identified from sequence similarities), clusters (e.g., multiple alignments), and trees (e.g., hierarchy of gene function). He goes on to describe how these networks can be mathematically represented and analyzed as graphs. The value of this higher-level representation of biological information becomes clear when looking at the network data representation in the KEGG database, which spans metabolic pathways, genome and comparative genome maps, expression maps, orthologous genes and gene molecular taxonomy, and disease catalogs. Databases such as KEGG represent good repositories for biochemical networks that can be drawn on to facilitate knowledge-based predictions. The organism reconstruction and protein-folding problems introduced in the first chapter are both knowledge-based prediction problems, requiring databases such as KEGG for biochemical networks and protein three-dimensional reference databases for structure prediction.

Post-genome Informatics is a “must-have” for all with a keen interest in bioinformatics. It’s a compact book, with many excellent illustrations and elegant perspectives on the opportunities and challenges in the times ahead. The focus of the work means that several important areas of bioinformatics are left largely ignored. These include the informatics challenges in DNA arrays, proteomics, variation analysis (mutation, population, phylogeny, biodiversity), and genetic mapping and QTL analysis. However, the book is a valuable contribution to the bioinformatics literature and the discussions of molecular and higher level networks, and it will be stimulating to all who traverse its pages.

Tim Littlejohn is chief scientific officer at eBioinformatics, Bay 16/104, Australian Technology Park, Eveleigh, NSW 1430, Australia ([email protected]).

Nature Biotechnology, Vol. 18, August 2000, p. 902. © 2000 Nature America Inc. http://biotech.nature.com
