AFP / Biosapiens 2008 SIG Meeting Program Toronto, Canada, July 2008
Dear AFP / Biosapiens 2008 attendee,

Welcome to the joint Automated Function Prediction / Biosapiens meeting at ISMB 2008. This is the second time AFP and Biosapiens have joined forces to bring you an engaging and stimulating program. As usual, we strive to bring you the latest cutting-edge research in computational gene and protein function prediction and annotation, delivered by leading international researchers.

This year we are holding a joint session with the 3D SIG on predicting function from protein structure. “Structure to Function” is a problem that is rapidly moving to the forefront of the life sciences because of the increasing number of unannotated structures coming from structural genomics projects.

We are also addressing a need voiced by the community to learn more about computational function prediction. Kimmen Sjölander from the University of California, Berkeley will conduct a workshop on function prediction using phylogenomics. Yanay Ofran from Bar Ilan University, Israel and Predrag Radivojac from Indiana University will present a variety of components, tools and techniques for computational function prediction. These tutorials are intended for researchers and students who are entering the field, and for those already in the field who wish to gain more knowledge and expertise. This is a rare opportunity to interact informally with leading experts in the field and enhance your own research.

We would like to thank the members of the Program Committee for carefully reviewing the abstracts submitted to this meeting. Thanks to the International Society for Computational Biology for hosting this meeting at ISMB 2008. Special thanks to Prof. Sean Mooney and his lab at the Indiana University School of Medicine for maintaining the conference website.
Iddo Friedberg (Chair), Michal Linial (Co-chair), Adam Godzik and Ying Zhang

Organizing committee:
Iddo Friedberg, University of California San Diego, USA
Michal Linial, The Hebrew University of Jerusalem, Israel
Ying Zhang, Burnham Institute for Medical Research, La Jolla CA, USA

Program Committee:
Iddo Friedberg (Chair)
Sarah Boyd
Jeffrey Chang
Barry Grant
Thomas Hamelryck
Piotr Kozbial
Michal Linial
Marc Marti-Renom
Mary Pacold
Ana Rodrigues
Ariel Schwartz
Priti Talwar
Mallika Veeramalai
Mark Wass
Daniela Wieser
Shirley Wu
Yuzhen Ye
Ying Zhang

AFP / Biosapiens 2008 Conference Program

Day 1 - Friday, July 18, 2008

8:30-8:45    Opening remarks
8:45-9:30    Patricia Babbitt (University of California, San Francisco)
             TBA
9:30-9:50    Judith Cohn (Los Alamos National Laboratory)
             Prediction of Functional Sites in SCOP Domains using Dynamics Perturbation Analysis
9:50-10:15   Mary Jo Ondrechen (Northeastern University)
             Partial Order Optimal Likelihood (POOL): A New Approach to High Performance Functional Site Prediction
10:15-10:35  Break
10:35-11:15  Barry Honig (Howard Hughes Medical Institute & Columbia University)
             On the nature of protein fold space: extracting functional information from apparently remote structural neighbors
             [Joint Session with 3D SIG]
11:15-11:35  Benoît H Dessailly (University College London)
             Assessing functional novelty of PSI structures via structure-function analysis of large and diverse superfamilies
             [Joint Session with 3D SIG]
11:35-11:55  TBD
             [Joint Session with 3D SIG]
11:55-12:35  Alfonso Valencia (CNIO, Spain)
             Prediction of functional characteristics based on sequence and structure
             [Joint Session with 3D SIG]
12:35-13:30  Lunch
13:30-14:15  Philip Bourne (University of California, San Diego)
             A Systematic Approach to Identifying Protein-Ligand Binding Profiles on a Proteome Scale
14:15-14:35  Mark Moll (Rice University)
             LabelHash: A Flexible and Extensible Method for Matching Structural Motifs
14:35-15:00  Shuye Pyu (University of Toronto)
             Analysis of Genetic Interaction Maps Reveals Functional Pleiotropy
15:00-17:00  Poster Session (Coffee break 15:30-16:00)
17:00-17:20  Philip Groth (Research Laboratories of Bayer Schering Pharma AG, Berlin, Germany)
             Hunting for gene function: Using phenotype data mining as a large-scale discovery tool
17:20-18:00  Roderic Guigo (Centre de Regulació Genòmica, Spain)
             Transcriptional complexity in the human genome

Day 2 - Saturday, July 19, 2008

8:30-8:40    Rally
8:40-9:20    Andrew Emili (University of Toronto)
             Investigating the biological role(s) of the functional orphan protein repertoire of Escherichia coli: Integrating experimental data with genomic inference to make testable predictions
9:30-12:00   Workshops (Coffee break 10:15-10:45)
             1) Kimmen Sjölander; 2) Yanay Ofran and Predrag Radivojac
12:00-12:25  Steven Brenner (University of California, Berkeley)
             Assessment of Molecular Function Prediction
12:30-13:30  Lunch
13:30-14:15  Peer Bork (The European Molecular Biology Laboratory)
             Predicting biological functions at different spatial scales
14:15-14:35  Daisuke Kihara (Purdue University)
             ESG: Extended Similarity Group method for automated protein function prediction
14:35-15:00  Michal Linial (The Hebrew University of Jerusalem)
             Safe Functional Inference for Uncharacterized Viral Proteins
15:05-15:30  David Horn (Tel Aviv University)
             Data mining of protein families using common peptides
15:30-16:00  Coffee break
16:00-16:20  Predrag Radivojac (Indiana University)
             Predicting Protein-Disease Relationships Using Sequence, Physicochemical Properties, and Molecular Function Information
16:20-16:40  Gaurav Pandey (University of Minnesota, Twin Cities)
             Association Analysis Techniques for Discovering Functional Modules from Microarray Data
16:40-17:20  Olga Troyanskaya (Princeton University)
             Computationally-driven experimental identification of protein function
17:20-17:30  Closing remarks

Talk Abstracts

Conserved Substrate Substructures & Protein Similarity Networks for Automated Annotation of Enzymes

Patricia Babbitt*, Ranyee Chiang, Shoshana Brown, Holly Atkinson
University of California, 1700 4th St., MC 2550, San Francisco, CA 94158-2330, USA
*Correspondence: [email protected]

1. INTRODUCTION

While many methods are available for inferring functional properties of uncharacterized proteins, homology-based analysis remains a major approach for assigning molecular function by annotation transfer. For enzymes, organizing homologous proteins into superfamilies related by structural similarities in the catalytic machinery required to catalyze a fundamental aspect of the chemical reactions represented is a powerful way to assign functional characteristics to newly discovered members. Yet many such superfamilies include multiple members that have evolved to catalyze different overall reactions using broadly dissimilar substrates, challenging the effectiveness of annotation transfer between diverse sequences and structures, particularly for automated function prediction. To address some of these issues, we have developed two new approaches to aid in functional annotation of members of enzyme superfamilies.
The first, identification of conserved substrate substructures associated with common aspects of catalysis across all diverse members of a superfamily, allows annotation using information that is orthogonal to that obtained from sequence and structural conservation patterns. The second approach uses protein similarity networks to facilitate annotation by mapping functional properties onto superfamily member proteins that have been clustered using simple sequence similarity metrics.

2. RESULTS

2.1 Conservation of substrate substructures for automated annotation of functional characteristics of enzyme superfamilies.

Just as conservation patterns across sequences and structures of homologous proteins can reveal clues about their functions, conservation patterns across their cognate substrates identify aspects of function associated with all member reactions, even when the substrates and overall reactions across a superfamily vary greatly. We analyzed graph isomorphisms among enzyme substrates for 42 SCOP superfamilies to establish the substrate conservation pattern for each. The chemical substructures conserved among all known substrates of each superfamily, the substructures that are reacting, and the relationship between the two were determined (1). This information can be used to annotate new sequences and structures identified as members of these superfamilies (Fig. 1). The results define an obligate substructure for the substrate of uncharacterized members, thereby restricting the reaction space to be explored for annotation of reaction specificity or for identifying features of ligands useful for screening or co-crystallization studies.

Fig. 1. Examples of uncharacterized structures from the PSI that can be annotated with the substrate substructure conserved in all members of the superfamily to which they belong.
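As an illustration of what checking a conserved substrate substructure can involve, the following minimal Python sketch tests whether a small labeled graph occurs in every substrate of a set, using a brute-force subgraph match. It is not the authors' implementation: the molecule encoding (heavy atoms only, no bond orders), the helper names molecule, contains_substructure and conserved_in_all, and the toy substrates are all assumptions made for this example, and the exhaustive search is only practical for very small graphs.

```python
from itertools import permutations


def molecule(atoms, bonds):
    """Toy encoding (an assumption): atoms maps node id -> element symbol,
    bonds becomes a set of unordered atom-id pairs."""
    return atoms, {frozenset(b) for b in bonds}


def contains_substructure(mol, pattern):
    """Return True if pattern maps into mol with matching atom labels and
    every pattern bond present in mol (brute-force subgraph match)."""
    m_atoms, m_bonds = mol
    p_atoms, p_bonds = pattern
    p_nodes = list(p_atoms)
    # Try every injective assignment of pattern atoms to molecule atoms.
    for candidate in permutations(m_atoms, len(p_nodes)):
        mapping = dict(zip(p_nodes, candidate))
        if any(p_atoms[p] != m_atoms[m] for p, m in mapping.items()):
            continue  # element labels do not match
        if all(frozenset(mapping[a] for a in bond) in m_bonds for bond in p_bonds):
            return True
    return False


def conserved_in_all(substrates, pattern):
    """A substructure counts as conserved only if every substrate contains it."""
    return all(contains_substructure(s, pattern) for s in substrates)


# Hypothetical toy substrates, heavy atoms only.
acetate = molecule({0: "C", 1: "C", 2: "O", 3: "O"}, [(0, 1), (1, 2), (1, 3)])
glycine = molecule({0: "N", 1: "C", 2: "C", 3: "O", 4: "O"},
                   [(0, 1), (1, 2), (2, 3), (2, 4)])

carboxyl = molecule({0: "C", 1: "O", 2: "O"}, [(0, 1), (0, 2)])
amine = molecule({0: "N", 1: "C"}, [(0, 1)])

print(conserved_in_all([acetate, glycine], carboxyl))  # True
print(conserved_in_all([acetate, glycine], amine))     # False
```

In this toy set the carboxyl-like fragment is shared by both substrates, while the amine fragment is present only in the second one and is therefore not conserved. A realistic analysis would rely on a cheminformatics toolkit and a maximum common substructure search rather than enumeration.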
This substructure analysis is automated, enabling large-scale identification of fundamental functional capabilities of new superfamilies, provided that a sufficient number of diverse members have already been functionally characterized.

2.2 Protein similarity networks for facile mapping of functional properties to sequence clusters in protein superfamilies.

To improve annotation of specific molecular function on a large scale, we have explored and validated the use of protein similarity networks (2) for visualization of functional trends across large and diverse protein superfamilies. Using pairwise comparisons of large sets of homologous sequences in superfamilies representing many different molecular functions, we show that even such simple approaches to clustering sequences in networks provide satisfactory depictions of high-dimensional similarity