From Proteome to Interactome and Beyond Adrien S.J

Overview Next challenges in protein–protein docking: from proteome to interactome and beyond Adrien S.J. Melquiond, Ezgi Karaca, Panagiotis L. Kastritis and Alexandre M.J.J. Bonvin∗ Advances in biophysics and biochemistry have pushed back the limits for the structural characterization of biomolecular assemblies. Large efforts have been devoted to increase both resolution and accuracy of the methods, probe into the smallest biomolecules as well as the largest macromolecular machineries, unveil transient complexes along with dynamic interaction processes, and, lately, dis- sect whole organism interactomes using high-throughput strategies. However, the atomic description of such interactions, rarely reached by large-scale projects in structural biology, remains indispensable to fully understand the subtleties of the recognition process, measure the impact of a mutation or predict the effect of a drug binding to a complex. Mixing even a limited amount of experimental and/or bioinformatic data with modeling methods, such as macromolecular docking, presents a valuable strategy to predict the three-dimensional structures of complexes. Recent developments indicate that the docking community is seething to tackle the greatest challenge of adding the structural dimension to interactomes. C 2011 John Wiley & Sons, Ltd. How to cite this article: WIREs Comput Mol Sci 2012, 2: 642–651 doi: 10.1002/wcms.91 INTRODUCTION former, molecular docking is already drifting toward nteractomes are huge, intricate, and highly dy- large-scale structure prediction. I namic molecular networks that determine the fate At the beginning of this century, structural ge- of the cell. They rely on thousands of protein com- nomics initiatives have opened the door to high- plexes that form the executive machinery underlying throughput and large-scale three-dimensional (3D) biological processes, from DNA replication to pro- structural determination of biomolecules. Along with tein degradation through metabolism. To catalyze the thriving pace of traditional structural biology, a wide diversity of chemical processes, proteins get this is resulting in tens of thousands of new struc- in touch with other proteins, nucleic acids, sugars, tures that are likely to feed upcoming projects aiming lipids, and various other molecules. In today’s postge- at adding the structural dimension to interactomes. nomic era, this versatility has been uncovered by stud- As a consequence, structural biology is evolving to- ies that have revealed the number and composition ward systems biology to gain a better understand- of these macromolecular assemblies, whereas others ing of how biomolecular units recognize each other and interact, modulate each other’s properties and/or have illuminated the structural and functional aspects 1 of their interactions. The quest for the elucidation of location and work together to fulfill their tasks. such networks is usually followed by the thirst of Information from either experimentally determined understanding, and lastly by the fantasy of predict- structures or comparative modeling has recently been ing. Although systems biology is still coping with the used to support the structural anatomy of a genome- reduced bacterium2; complemented by single-particle ∗Correspondence to: [email protected] electron microscopy and electron tomography, this is Bijvoet Center for Biomolecular Research, Faculty of Science, providing 3D details on the proteome organization. Utrecht University, Utrecht, The Netherlands This outstanding example of the structural mapping DOI: 10.1002/wcms.91 of a cell relies on structural templates to construct 642 c 2011 John Wiley & Sons, Ltd. Volume 2, July/August 2012 WIREs Computational Molecular Science Challenges in protein–protein docking FIGURE 1| Predicting three-dimensional (3D) interactomes by docking. Starting from experimental structures or models of all (or a subset of expressed proteins) in an organism, cross-docking followed by binding affinity prediction would ideally allow to predict an interactome, adding at the same time the structural dimension to it. models of binary interactions. Unfortunately, there is been developed for large-scale detection of protein– still a large gap between the number of complexes protein interactions (PPIs) such as yeast-two-hybrid identified by large-scale proteomics efforts and those screening, protein microarrays, or the split-ubiquitin for which high-resolution 3D experimental structures method. Among these approaches, mass spectrome- are available. At the same time, methods such as cryo- try (MS) techniques that determine the composition, electron microscopy (cryo-EM) or Small angle X-ray stoichiometry, and subunit organization of macro- scattering (SAXS) are generating a wealth of lower molecular complexes are of indispensable value.3 De- resolution structural information that can be used to termination of whole organism interactomes from complement the prediction efforts. Therefore, the ul- high-throughput protein interaction approaches is, timate tour de force would be to combine atomic however, still nontrivial and far from complete and structures or models of individual subunits with low- could, therefore, be well complemented by computa- resolution data and a mixed bag of other experimen- tional means. tal or bioinformatics information to add the structural dimension to interactomes (Figure 1). This also means tackling the challenges of predicting large conforma- Why Predicting Interactomes using 3D tional changes potentially occurring upon binding, Structures of Biomolecules? dealing with heterogeneous multicomponent assem- High-throughput and large-scale screening techniques blies, and predicting binding affinity. These are all re- are used to characterize PPIs in vivo. Despite the mas- quired in order to be able to model whole complexes sive number of interactions detected by these pro- or systems of the utmost importance in biology. The teomic techniques, interactome coverage remains low, molecular docking community is no longer waiting roughly 50% and 10% for the yeast and human in- and is already stepping into those challenges as will teractomes, respectively.4–6 The proteome coverage be further described in the following. is typically limited to about 70% for the best approaches and the inherent fraction of false positives UNCOVERING PROTEIN–PROTEIN certainly hampers a better coverage of PPIs. More- INTERACTIONS over, some methods have difficulties with specific types of interactions, which spur the use of comple- One of the most obvious challenges in the protein mentary techniques. Finally, PPI proteomic datasets interaction world is simply to identify ‘who inter- have often a very limited overlap (roughly 20%), even acts with whom’. High-throughput methods have when similar detection methods are used.7 Volume 2, July/August 2012 c 2011 John Wiley & Sons, Ltd. 643 Overview wires.wiley.com/wcms Beyond their significant contribution to unravel BOX 1: THE CRITICAL ASSESSMENT PPI networks, most experimental methods focus on OF PREDICTED INTERACTIONS EXPERI- mapping the network topology, i.e., drawing binary MENT interactions between dots. Even if this representation 13 of the ‘proteins at work’ is crucial to characterize CAPRI is a ‘blind’ docking experiment in which partic- the actors of the biological processes, it does not ipants have a limited time to predict the structure of a depict the biophysical properties of the underlying complex given only the structures, sometimes even only biomolecular complexes, such as the dynamics of the the sequences, of its free constituents. A recent assessment interactions, which distinguishes between what is ob- shows that docking algorithms efficiently and accurately ligate and what is transient, neither does it answer the predict complexes that undergo small-to-medium-size con- 14 question whether the interactions occur in cascade or formational changes, although they still encounter major simultaneously. For macromolecular assemblies, for difficulties in case of large conformational changes. Please example, PPI networks typically provide little infor- refer to Refs 15 and 16 for recent reviews about molecular mation about who contacts whom within a complex docking software, and Ref 17 for the current challenges and how mutations might affect function.8 To start and limitations highlighted by the CAPRI experiment. addressing such questions, typically more classical and time-consuming (structural) biology approaches • The scoring function should be based on rel- must be followed. evant physicochemical properties or princi- Computational methods to predict protein as- ples of the system that have been determined semblies may therefore play a role to analyze inter- to play a major role in the energetics of actomes, providing additional insights with leverage 12 , binding. of the structural models.8 9 Approaches using the • knowledge of the 3D structure are indeed generally These physicochemical properties should be more accurate than those based on sequence only.10 combined in a simple and fast model in or- They are consequently better suited to tackle the is- der to reproduce accurately, within the exper- sues mentioned above. These ‘new players’ aim at pre- imental error, a large number of experimen- dicting how the proteome is wired and how dynamic tally determined high-quality binding affinity changes in the interactome can be induced by envi- data. ronmental factors. However, until methods become • The algorithm should perform well

From Proteome to Interactome and Beyond Adrien S.J

Induction of Tumor Apoptosis Through a Circular RNA Enhancing Foxo3 Activity

© Copyright 2012 Craig J. Bierle Portions of This Dissertation Were

Computational Modeling of Rna-Small Molecule and Rna-Protein Interactions

A Uracil Transport Metabolon in E. Coli: Interaction Between the Membrane Transporter Uraa and the Cytosolic Enzyme UPRT

Assessing HADDOCK's Protein-Ligand Ensemble Docking Capabilities Through Urokinase Inhibitors

Investigation of RNA Editing Sites Within Bound Regions of RNA-Binding Proteins

Protein-Protein Docking with HADDOCK

Evolution of in Silico Strategies for Protein-Protein Interaction Drug Discovery

Molecular Docking Studies – a Review

Macromolecular Docking Simulation to Identify Binding Site of FGB1 for Antifungal Compounds

Methods and Applications of in Silico Aptamer Design and Modeling

Genome-Wide Identification and 3D Modeling of Proteins Involved in DNA Damage Recognition and Repair (Final Report)