GI-Edition Proceedings

Gesellschaft für Informatik e.V. (GI) GI-Edition publishes this series in order to make available to a broad public recent findings in informatics (i.e. computer science and informa- Lecture Notes tion systems), to document conferences that are organized in cooperation with GI and to publish the annual GI Award dissertation. in Informatics Broken down into • seminar • proceedings Dietmar Schomburg, • dissertations • thematics Andreas Grote (Eds.) current topics are dealt with from the vantage point of research and development, teaching and further training in theory and prac- tice. The Editorial Committee uses an intensive review process in order to ensure high quality contributions. German Conference on The volumes are published in German or English. Bioinformatics 2010 Information: http://www.gi-ev.de/service/publikationen/lni/ September 20 - 22, 2010 D. Schomburg, A. Grote (Eds.): GCB 2010 Grote (Eds.): A. Schomburg, D. Braunschweig, Germany ISSN 1617-5468 ISBN 978-3-88579-267-3 This volume contains papers presented at the 25th German Conference on Bioin- formatics held at the Technische Universität Carolo-Wilhelmina in Braunschweig, Germany, September 20-22, 2010. The German Conference on Bioinformatics is 173 an annual, international conference, which provides a forum for the presentation Proceedings of current research in bioinformatics and computational biology. It is organized on behalf of the Special Interest Group on Informatics in Biology of the German Society of Computer Science (GI) and the German Society of Chemical Technique and Biotechnology (Dechema) in cooperation with the German Society for Bio- chemistry and Molecular Biology (GBM). Dietmar Schomburg, Andreas Grote (Editors) German Conference on Bioinformatics 2010 September 20-22, 2010 Technische Universität Carolo Wilhelmina zu Braunschweig, Germany Gesellschaft für Informatik e.V. (GI) Lecture Notes in Informatics (LNI) - Proceedings Series of the Gesellschaft für Informatik (GI) Volume P-173 ISBN 978-3-88579-267-3 ISSN 1617-5468 Volume Editors Prof. Dr. Dietmar Schomburg Technische Universität Carolo-Wilhelmina zu Braunschweig Email: [email protected] Dr. Andreas Grote Technische Universität Carolo-Wilhelmina zu Braunschweig Email: [email protected] Series Editorial Board Heinrich C. Mayr, Universität Klagenfurt, Austria (Chairman, [email protected]) Hinrich Bonin, Leuphana-Universität Lüneburg, Germany Dieter Fellner, Technische Universität Darmstadt, Germany Ulrich Flegel, SAP Research, Germany Ulrich Frank, Universität Duisburg-Essen, Germany Johann-Christoph Freytag, Humboldt-Universität Berlin, Germany Thomas Roth-Berghofer, DFKI Michael Goedicke, Universität Duisburg-Essen Ralf Hofestädt, Universität Bielefeld Michael Koch, Universität der Bundeswehr, München, Germany Axel Lehmann, Universität der Bundeswehr München, Germany Ernst W. Mayr, Technische Universität München, Germany Sigrid Schubert, Universität Siegen, Germany Martin Warnke, Leuphana-Universität Lüneburg, Germany Dissertations Dorothea Wagner, Universität Karlsruhe, Germany Seminars Reinhard Wilhelm, Universität des Saarlandes, Germany Thematics Andreas Oberweis, Universität Karlsruhe (TH) Gesellschaft für Informatik, Bonn 2010 printed by Köllen Druck+Verlag GmbH, Bonn Preface This volume contains papers presented at the 25th German Conference on Bioinformat- ics held at the Technische UniversitätCarolo-Wilhelmina in Braunschweig, Germany, September20-22, 2010. The German Conference on Bioinformatics is an annual, international conference, whichprovides aforum for the presentation of currentresearch in bioinformatics and computational biology.Itisorganized on behalf of the Special Interest Group on Informatics in Biology of the German SocietyofComputer Science (GI) and the German SocietyofChemical Technique and Biotechnology (Dechema) in cooperation with the German Societyfor Biochemistry and Molecular Biology (GBM). Fiveoutstanding scientists were invited to givekeynote lectures to the conference: • Edda Klipp -‘Cellular stress response and regulation of metabolism’ • Thomas Lengauer -‘HIV Bioinformatics: Analyzing viral evolution for the ben- efit of AIDS patients’ • Werner Mewes -‘The data deluge: can simple models explain complex biological systems?’ • Stefan Schuster -‘Road games in metabolism -Abiotechnological perspective’ • Gregory Stephanopoulos -‘After adecade of systems biology,time for arecord card’ Besides the keynotelectures, the scientific program comprised 22 contributed talks presenting 12 regular and 10 short papers. All accepted regular papers are collected in these proceedings. The remaining accepted contributions, i.e. short papers and poster abstracts, are published in aseparate volume. We would liketothank the program committee members and all reviewers for their evaluations of the contributions. Furthermore, we would like to thank the local organizers for keeping the conference running. The organizers are grateful to all the sponsors and supporting scientific part- ners. Last but not least, we would liketothank all contributors and participants of the GCB 2010. Braunschweig, August 2010 Dietmar Schomburg and Andreas Grote 5 Organizers Conference Chair Dietmar Schomburg, Braunschweig Local Organizers Wolf-Tilo Balke(TU Braunschweig) Andreas Grote (TU Braunschweig) Sándor Fekete (TU Braunschweig) Katharina Hanke(TU Braun- Reinhold Haux (TU Braunschweig) schweig) Dieter Jahn (TU Braunschweig) Adam Podstawka(TU Braun- Frank Klawonn (Ostfalia University schweig) of Applied Sciences) Alexander Riemer (TU Braun- Constantin Bannert (TU Braun- schweig) schweig) Maurice Scheer (TU Braunschweig) Antje Chang (TU Braunschweig) Programm committee Mario Albrecht,Saarbrü cken Michael Marschollek, Hannover Wolf-Tilo Balke, Braunschweig Werner Mewes, München Tim Beißbarth, Göttingen Michael Meyer-Hermann Frankfurt Thomas Dandekar, Würzburg Burkhard Morgenstern, Göttingen Sándor Fekete, Braunschweig Stefan Posch, Halle Georg Füllen, Rostock Matthias Rarey,Hamburg Robert Giegerich, Bielefeld Falk Schreiber, Gatersleben Reinhold Haux, Braunschweig Stefan Schuster, Jena Ralf Hofestädt,Bielefeld Gregory Stephanopoulos, Cam- Matthias Heinemann, Zürich bridge USA Dieter Jahn, Braunschweig Jens Stoye,Bielefeld Frank Klawonn, Wolfenbüttel Andrew Torda, Hamburg Edda Klipp, Berlin Martin Vingron, Berlin Ina Koch,Berlin Christian vonMering, Zürich Oliver Kohlbacher, Tübingen Edgar Wingender, Göttingen Thomas Lengauer, Saarbrü cken Andreas Ziegler, Lübeck Hans-Peter Lenhof, Saarbrü cken 6 Sponsors and supporters Supporting scientific societies Gesellschaft für Chemische Technik und Biotechnologie e.V. (DECHEMA) http://www.dechema.de Gesellschaft für Biochemie und Molekularbiologie e.V. (GBM) http://www.gbm-online.de Gesellschaft für Informatik e.V. (GI) http://www.gi-ev.de Non-profit sponsors Technische UniversitätBraunschweig http://www.tu-braunschweig.de Commercial sponsors Biobase -Biological Databases http://www.biobase-international.com CLC bio http://www.clcbio.com ConveyComputer http://www.conveycomputer.com genomatix http://www.genomatix.de 7 geneXplain http://www.genexplain.com MEGWARE Computer Cluster http://www.megware.com Thalia http://www.thalia.de Transtec -IT&Solutions http://www.transtec.de 8 Table of Contents Preface 5 Organizers 6 Sponsors and supporters 7 Table of Contents 9 Submissions 11 Andreas R. Gruber, Stephan H. Bernhart, YouZhou, IvoL.Ho- facker RNALfoldz: efficientprediction of thermodynamically stable, local secondary structures ...................................... 11 Florian Battke, Stephan Körner,Steffen Hüttner, KayNieselt Efficientsequence clustering for RNA-seq data without areference genome .21 Florian Blöchl, Maria L. Hartsperger, Volker Stü mpflen, Fabian J. Theis Uncoveringthe structure of heterogenous biological data: fuzzy graph parti- tioning in thek-partite setting .......................... 31 Sergiy Bogomolov, Martin Mann, Björn Voß, Andreas Podelski, Rolf Backofen Shape-based barrier estimation for RNAs .................... 41 Thomas Fober, Marco Mernberger, Gerhard Klebe, EykeHü llermeier EfficientSimilarityRetrieval of Protein Binding Sites based on Histogram Comparison ..................................... 51 Peter Husemann, Jens Stoye Repeat-aware ComparativeGenome Assembly .................. 61 Katrin Bohl, Lu´ı sF.deFigueiredo, Oliver Hädicke,Steffen Klamt, Christian Kost, Stefan Schuster, Christoph Kaleta CASOP GS: Computing intervention strategies targeted at production im- provementingenome-scale metabolic networks ................. 71 Jan Grau, Daniel Arend, IvoGrosse, Artemis G. Hatzigeorgiou, Jens Keilwagen, Manolis Maragkakis, Claus Weinholdt, Stefan Posch Predicting miRNA targets utilizing an extended profile HMM ......... 81 9 Arli A. Parikesit, Peter F. Stadler, Sonja J. Prohaska QuantitativeComparison of Genomic-Wide Protein Domain Distributions .. 93 Jan Budczies, Carsten Denkert, Berit M. Müller, Scarlet F. Brockmöller, Manfred Dietel, Jules L. Griffin, Matej Oresic, Oliver Fiehn METAtarget –extracting keyenzymes of metabolic regulation from high- throughput metabolomics data using KEGG REACTION information ....103 Andreas Gogol-Döring, WeiChen Finding Optimal Sets of Enriched Regions in ChIP-Seq Data .........113 Enrico Glaab, Jonathan M. Garibaldi, Natalio Krasnogor Learning pathway-based decision rules to classify microarraycancer samples .123 Index of authors 135 10 Gruber et al. 11 RNALfoldz: efficient prediction of thermodynamically stable, local secondary structures Andreas R.

GI-Edition Proceedings

The SUPERFAMILY 2.0 Database: a Signiﬁcant Proteome Update and a New Webserver Arun Prasad Pandurangan 1,*, Jonathan Stahlhacke2, Matt E

Tum1 Is Involved in the Metabolism of Sterol Esters in Saccharomyces Cerevisiae Katja Uršič1,4, Mojca Ogrizović1,Dušan Kordiš1, Klaus Natter2 and Uroš Petrovič1,3*

Smurflite: Combining Simplified Markov Random Fields With

Renaissance SUPERFAMILY in the Decade Since Its Launch, a Repository of Information About Proteins in Genomes Has Developed Into a Primary Reference

Specialized Hidden Markov Model Databases for Microbial Genomics

Dcgo: Database of Domain-Centric Ontologies on Functions, Phenotypes, Diseases and More Hai Fang* and Julian Gough*

Interpro: the Integrative Protein Signature Database

Article Reference

Assignment of Homology to Genome

Package 'Dcgor'

HMMER Web Server Documentation Release 1.0

Sequences and Topology: Disorder, Modularity, and Post/Pre Translation Modiﬁcation