Dagstuhl-Seminar-Report; 46 07.09.-11.09.92 (9237)
Total Page:16
File Type:pdf, Size:1020Kb
Thomas Lengauer, Dietmar Schomburg, Michael S. Waterman (editors): Molecular Bioinformatics Dagstuhl-Seminar-Report; 46 07.09.-11.09.92 (9237) ISSN 0940-1121 Copyright© 1992 by IBFI GmbH,Schloß Dagstuhl, W-6648 Wadern, Germany Tel.: +49-6871 - 2458 Fax: +49-6871 - 5942 Das InternationaleBegegnungs- und Forschungszentrumfür lnformatik(IBFI) ist eine gemein- nützige GmbH. Sie veranstaltet regelmäßigwissenschaftliche Seminare, welche nach Antrag der Tagungsleiterund Begutachtungdurch das wissenschaftlicheDirektorium mit persönlich eingeladenen Gästen durchgeführt werden. Verantwortlich für das Programm: Prof. Dr.-Ing. Jose Encamagao, Prof. Dr. Winfried Görke, Prof. Dr. Theo Härder, Dr. Michael Laska, Prof. Dr. Thomas Lengauer, Prof. Ph. D. Walter Tichy, Prof. Dr. Reinhard Wilhelm (wissenschaftlicherDirektor) Gesellschafter: Universität des Saarlandes, Universität Kaiserslautern, Universität Karlsruhe, Gesellschaft für Informatik e.V., Bonn Träger: Die Bundesländer Saarland und Rheinland-Pfalz Bezugsadresse: Geschäftsstelle Schloß Dagstuhl Informatik, Bau 36 Universität des Saarlandes W - 6600 Saarbrücken Germany Tel.: +49 -681 - 302 4396 Fax: +49 -681 - 302 - 4397 e-mail: [email protected] Molecular Bioinformatics Thomas Le11gauer,Dietlnar Schomburg. 1\=Ii(:l1aelVVato1'111an (editors) Dagstuhl-~Semi11ar-Report, S(pt.emb<-I7 - 11 1992 (9237) Workshop on Molecular Bioinformatics Organizers: Thomas Lengauer (GMD, Schloss Birlinghoven/University of Bonn), Dietmar Schomburg (GBF, Braunschweig), Michael S. Waterman (USC, Los Angeles) September 7 - 11, 1992 Molecular Bioinformatics is a notion that. one can a.ssignto a.n area.in applied com- puter science which is rapidly gaining significance. Roughly, this area is conc'ernecl with the development of methods and tools for analyzing, nn(lerst.an('ling,reasoning about and, eventually, designing large biomole('ul<-~sauch as DNA. R NA, and proteins with t.he aid of the computer. With the knowledge in molecula.r biology increasing at an explosive rate, and data on genomes and their products being (.'ollect.e(.lat tremendous speeds. :V'lolecular Bioinformatics becomes an important challenge to applied computer scientist.s. Before this background, the Dagstuhl Seminar on Molecular Bioinformatics brought together experts from all over the world t.hat are working on algorit.hn1ic issues in this eld. The Workshop was interdiscplina.ry, with people frorn mole<'.u lar biology, computer science, and applied 1cnathe1naticsa.ttending. The worksliop focussed on the following topics: 0 Alignment of biomolecular sequences(DNA, RNA, Proteins), o Modeling of large biomolecules, including the prediction a11danalysis of secon- dary and higher-levelstructure as well as spatial conformations(folding), 0 Molecular dynamics and simulations of interactions between biomolecules, o Interpreting nucleotide sequencesand their role in gene regulation, 0 Reading genomic sequences. Besidesthe presentations,there were two organizedevening discussion sessions on the topics PAM matrices and Computer-AidedDrug Design. The experimentof bringing togetherresearchers with wide]y varying bacl~:g'roun(ls to discuss an exciting new interdisciplinary field was successful. The attendees dis- cussedlively and often controversially,developed a senseof identity for the new field during the workshop, and went back home with new insights, problemsand ideas. For the German researchers,the workshop was an ideal preparation for forming cooperationswithin the new funding program Molecular Bioinformatics that had just beenannounced by the BM FT (GermanMinistery for Researchand Technology).A few participants evaluatedthe workshopas their most productive workshop experience. We are especially grateful to the Dagstuhl office and team for their excellent organization in preparing and conducting the workshop, as well as their always Z enga.ge(land personalsupport.» in all matters. The cordial a.t.rnosphe~rein the .S'(fhIol$ was an essential ingredient of the Sll(('(_'5sof this workshop. We are also grateful to NSF for providing a grant for nancial support. of int..ercontinent.altravel of the (7.8. based participants of this workshop. Program Lengauer Morning Session Apostolico hdanber Chang Giegerich Naor AfternoonSession WatermanMonday, September 7 Welcome and Introductory Remarks Sequence Alignment VingronChair: Ga.d Landau Structuring Sequence Data Banks for lnstantant-éousPa.ra.ll<-I Search AapproxinialeieString Ma.t<liir1g:Applications Blum.»\ppro.\'i1na.l'e Mal cliing with (i'(msta.ntliraclion l:Crror linaihedclingSequence Analysis in a Functional Programming Environ ment hdanberRepresenting Suboptimal Aligments Chair: Gaston Gonnet Parametric SequenceAlignment Interpreta.tion of Pa1'a.1net1'icAlignment Plots On Locally Optimal Alignments in Genetic Sequences Approximate String Matching: Algorithms Morning Session Karp Tuesday, September 8 Shamir Physical Mapping Cl1a.ir: Gene Lawler Chang The Physica.l Mapping Problem Physical Mapping of DNA and lnterval G1°aplis Physical Mapping in Practice Gonnet Sequences and Trees Warn ow Towards the Best PossibleTheory a.nd Algorithms for Se- quence Alignment New problems in l_'3volut.ionaryTree (Tonstruction Afternoon Session Secondary and Tertiary Structure Schomburg Chair: Tim Havel Sander Computer-Ai(le<lDesign of Proteins With New Properties Selbig Morning Session Havel Crippen Chan Koch Wednesday, September 9 Tertiary Structure Miyano ()7hair: (.i.'el1risSander Distance (_ileomct1'_yand Homology Modeling Prediction of the 'le1'tiaryStructure of Globular Proteins over l)is-- crete (,i01lii()llIlEl.il01lil.lSpaces Morning Session A Lattice l£nuri1c*rationApproach to Protein Folding I)ress Secondary Structure Graph Theo1°etic.alDescription of Sheet Topologies Sankoff Machine Discovery by Decision 'l1'eesover Regular Pat1.crns von Haeseler Gonnet Thursday, September 10 Sequences Afternoon Session I)ress Chair: Udi l\lanber ls Darwinism a Falsiiiable Tlieory? Methods in Sequeiice .='\nalysis Schmidt Climbing a Tree Tlirough the Window Reconstructing Phylogenetic Trees Zuker Text Searching Algorithms in Darwin Chair: Temple Smith Statistical Geometry in SequenceSpace TaylorSecondary Structure Problems in Coiledcoil Proteins RNA Secondary Structure Modeling Molecular Graphics EveningSessionDelegate Analysis from a Macromolecular Graphics l1rt.e1fa.ce:Ex amples from SequenceAnalysis Problem Areas for Interaction BetweenBiochemistry and Compu- Havel ter Science NMR Spectroscopyfor Determination of Protein Structure Crippen Methods of Drug Design Friday, September 11 Tertiary Structure Morning Session Chair: Gordon Crippen Schulze-Kremer Applications of Articial Intelligenceand MachineLearning to Pro- tein Structure Analysis Sippl Hide and Seek on a Polyprotein Kaden Double-Point Chains in Proteins and InverseProtein Folding Schneider Sequence-StructureRelationships in the Twighlight Zone Structuring Sequence Data Banks for Instantaneous Parallel Searching Alberto Apostolico, University of Padua Current molecular sequence data. ba.nksconsist mainly of raw sequenceswith some annotation. While such a basic information will hardly be forfait.ed ever in the future, auxiliary data structures are being gradually introduced and studied which facilitate various kinds of searchesand ('omparisons.For some such manipulations. serial computation is inadequate,so that efficient parallel methods are sought. We present a data structure that supports constant-time implementation on a CRCVV PRAM of exact searchesfor arbitrary patterns into arbitrary substringsof a sequence data. bank. On the Accurate Notion of Locally Optimal Alignments and Subalignments in Genetic Sequences Norbert Blum, University of Bonn We review old and new results with respectto locally optimal alignmentsand sub- alignments in genetic sequences.The main theorem is that the subgraphof the edit. graph, containing exactly the locally optimal subalignments,can be computed in ('9(nmlog(_n+ m)) time, in the casethat the underlying cost functions are concave. A Lattice Enumeration Approach to Protein Folding Hue Sun Chan, University of California at San Francisco (UCSF) To study the protein folding problem, we use exhaustive sequenceand conforma- tional enumerationsto study copolymerchains configuredon lattices. These model moleculesare short self-avoidingchains of hydrophobic (H) and polar (P) mono- mers. This simple model showsthat under folding conditions, a signicant fraction of H/ P copolymersexhibit protein-like behavior suchas high compactness,conside- rable amount of secondarystructure, and low degeneracyof the lowestenergy state. We also explore the folding kinetics of those H/ P copolymerswhich have unique native structures. Under folding conditions, thesemodel protein moleculescollapse quickly to an ensembleof relatively compact conformations,and then re-arrange much more slowly as they seektheir unique native states. Folding time of the model moleculesis strongly sequence-dependent,because the arrangementof Hs and Ps along the sequencedetermines the energeticlandscape of the chains conformatio- nal space. The fastest folding sequencesare those whose native structures are most accessible and least protected by energy barriers. (This is joint work with Ken A. Dill at UCSF) Physical Mapping in Practice William I. Chang, Cold Spring Harbor Laboratory, N.Y. USA (lose collaboration l)et.ween biologists (D. lieacli Lab) and inatl1en'1atica.ls(ic11l;isl.s