Johann Gasteiger, Thomas Engel (Eds.)
Chemoinformatics
A Textbook

Chemoinformatics: A Textbook. Edited by Johann Gasteiger and Thomas Engel
Copyright 2003 Wiley-VCH Verlag GmbH & Co. KGaA. ISBN: 3-527-30681-1

Related Titles from WILEY-VCH

Hans-Dieter Höltje, Wolfgang Sippl, Didier Rognan, Gerd Folkers
Molecular Modeling
2003
ISBN 3-527-30589-0

Helmut Günzler, Hans-Ulrich Gremlich
IR Spectroscopy
An Introduction
2002
ISBN 3-527-28896-1

Jure Zupan, Johann Gasteiger
Neural Networks in Chemistry and Drug Design
An Introduction
1999
ISBN 3-527-29778-2 (HC)
ISBN 3-527-29779-0 (SC)

Siegmar Braun, Hans-Otto Kalinowski, Stefan Berger
150 and More Basic NMR Experiments
A Practical Course
1998
ISBN 3-527-29512-7

Editors:

Prof. Dr. Johann Gasteiger
Computer-Chemie-Centrum and Institute of Organic Chemistry
University of Erlangen-Nürnberg
Nägelsbachstraße 25
91052 Erlangen
Germany

Dr. Thomas Engel
Computer-Chemie-Centrum and Institute of Organic Chemistry
University of Erlangen-Nürnberg
Nägelsbachstraße 25
91052 Erlangen
Germany

This book was carefully produced. Nevertheless, editors, authors and publisher do not warrant the information contained therein to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Library of Congress Card No.: applied for

A catalogue record for this book is available from the British Library.

Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at http://dnb.ddb.de

ISBN 3-527-30681-1

© 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Printed on acid-free paper.

All rights reserved (including those of translation in other languages).
No part of this book may be reproduced in any form (by photoprinting, microfilm, or any other means) nor transmitted or translated into machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.

Composition: Hagedorn Kommunikation GmbH, Viernheim
Printing: betz-druck GmbH, Darmstadt
Bookbinding: Litger & Dopf Buchbinderei GmbH, Heppenheim

Printed in the Federal Republic of Germany.

J. Gasteiger:
To Uli, Nina, Julia and Michael

T. Engel:
To my family and Gabi

If you want to build a ship, don't drum up people to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.
Antoine de Saint-Exupéry

Foreword

Mathematically speaking, two chicken halves give one chicken. This is true with respect to mass. Such a chicken, however, is not a living one. Moreover, the two halves must be mirror images to form the shape of a chicken. Obviously, numbers are only half of the story; information comes in disparate guises. Chemistry, the science of materials and their transformations, exhibits a broad diversity of information, which by now encompasses an enormous body of knowledge about chemical structures, properties, and reactions. However, despite all the achievements during the last two centuries, which changed early chemical craftsmanship into a sophisticated natural science, chemistry is still devoid of an all-round theory that has, for example, the potential to predict precise structure-activity or structure-function relationships. Such a theory would explain, for instance, why palytoxin (C129H223N3O54), isolated from a Hawaiian coral in 1979, is one of the most poisonous natural substances.
With its 64 chiral carbon atoms and six olefinic bonds, offering more than 10^21 possible isomers, the total synthesis of palytoxin carboxylic acid may be compared with the first ascent of Mount Everest (Kishi et al. at Harvard University, 1989).

Comparable efforts are needed to master the flood of information and accumulated knowledge in chemistry today. While until 1960 the number of natural and laboratory-produced compounds had increased almost linearly to roughly one million in about 150 years, its growth became exponential from then on, reaching 18 million in 2000. This is just one aspect of the revolution in chemistry brought about by the rapid advancement of computer technology since 1965. Methods of physics, mathematics and information science entered chemistry to an unprecedented extent, which furnished laboratories with powerful new instrumental techniques. Also, a broad variety of model-based or quantum-mechanical computations became feasible that were thought impossible a few decades ago. For example, the computer modeling of water transport through membranes mediated by aquaporins yields the time dependence of the spatial position of typically 10^5 atoms on a picosecond scale up to 10 ns (Grubmüller et al., MPI Göttingen). Such huge data arrays can be searched for and accessed via computer networks and then evaluated in a different context ("data mining"). Furthermore, new chemical techniques, such as combinatorial synthesis, have a high data output. Overall it can be stated that, particularly in the chemical and pharmaceutical industries, researchers now spend more time digesting data than generating them, whereas the reverse was true a few years ago.

In the 1970s, chemists increasingly encountered varying aspects of the triumvirate "chemistry-information-computer" (CIC) while conducting their research.
Common to all was the use of computers and information technologies for the generation of data, the mixing of data sources, the transformation of data into information and then information into knowledge for the ultimate purpose of solving chemical problems, e.g. organic synthesis planning, drug design, and structure elucidation. These activities led to a new field of chemical expertise which had distinctly different features compared with the traditional archiving approach of chemical information, which was established about 200 years ago and comprises primary journals, secondary literature, and retrieval systems like Chemical Abstracts.

In the 1980s, computer networks evolved and opened a new era for fast data flow over almost any distance. Their importance was not generally recognized in the chemical community at the beginning. The situation may be characterized with words from the late Karl Valentin: "A computer network is something that one does not want to be in the need to have, nevertheless simply must want to have, because one always might be in need to use it. (Ein Datennetz ist etwas was man eigentlich nie brauchen müssen möchte, aber doch einfach wollen muß, weil man es immer brauchen tun könnte.)"

In 1986 Johann Gasteiger, also in Munich, co-initiated the Task Force CIC of the German Chemical Society (GDCh). In the same year, he started the CIC Workshops on Software Development in Chemistry, which found overwhelming acceptance. To this day these annual meetings have served as a forum for the presentation and dissemination of recent results in the various CIC fields, including chemical information systems. The Task Force CIC later merged with the GDCh Division Chemical Information under the name "Chemie-Information-Computer (CIC)".

The 1990s saw the advent of the Internet, which boosted the use of computer networks in chemistry.
In its wake, at the turn of the century, the work of the forerunners at the intersection of chemistry and computer science eventually received recognition in its entirety as a new interdisciplinary science: "Chemoinformatics" had come of age. It encompasses the design, creation, organization, management, retrieval, analysis, dissemination, visualization, and use of chemical information (G. Paris). Its cousin, Bioinformatics, which was developed somewhat earlier, generally focuses on genes and proteins, while chemoinformatics centers on small molecules. Yet the distinction is fuzzy, e.g. when the binding of small molecules to proteins is addressed. From the viewpoint of the life sciences, the borderline may blur completely.

A young scientific discipline grows with its students. For students, in turn, the efforts pay dividends. The demand from industrial employers for chemoinformaticians increases steadily, and so the field is expected to become big business. The question therefore arises how and where chemoinformatics can be learned. One good place to go is Erlangen. Johann Gasteiger and his group had been practicing chemoinformatics for 25 years without even knowing its name. In 1991 he received the Gmelin-Beilstein Medal of the German Chemical Society (GDCh), and in 1997 he received the "Herman Skolnik Award" of the Division of Chemical Information of the American Chemical Society in recognition of his many achievements in the CIC field. With the present comprehensive textbook on chemoinformatics for undergraduate and graduate students, Gasteiger and his group lay a solid foundation stone for many more "Erlangens" in the world.

My warm recommendation goes with this book, in particular to my academic colleagues. It seems to be the right time for universities to begin to teach chemoinformatics on a broad scale.
Subject to local reality this may be conducted as part of a diploma course of study in chemistry, as a master's study after a BS in chemistry, or as a full course of study in chemoinformatics, like that available in bioinformatics. In all cases, this book and its supplementary material