
Structures in systems biology

Pedro Beltrao1, Christina Kiel2 and Luis Serrano2

Oil and water do not normally mix, and apparently structural biology and systems biology look like two different universes. It can be argued that structural biology could play a very important role in systems biology. Although at the final stage of understanding a signal transduction pathway, a cell, an organ or a living system, structures could be obviated, we need them to be able to reach that stage. Structures of macromolecules, especially molecular machines, could provide quantitative parameters, help to elucidate functional networks or enable rationally designed perturbation experiments for reverse engineering. The role of structural biology in systems biology should be to provide enough understanding so that macromolecules can be translated into dots or even into equations devoid of atoms.

Addresses
1 European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg D69115, Germany
2 Centre de Regulacio Genomica (CRG), Dr Aiguader 88, 08003 Barcelona, Spain

Corresponding author: Serrano, Luis

Current Opinion in Structural Biology 2007, 17:378–384
This review comes from a themed issue on Sequences and topology
Edited by William R Pearson and Anna Tramontano
Available online 15th June 2007
0959-440X/$ – see front matter
© 2007 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.sbi.2007.05.005

Introduction
There are probably as many definitions of systems biology as research institutes in the world. However, a large number of scientists will probably agree that systems biology implies the quantitative understanding of a system, rather than of the individual components, allowing testable predictions to be made [1]. As such, systems biology requires acquisition of data, parameter quantification, analysis and mathematical modeling. Normally, systems biology is associated with genome-wide studies, having little to do with structural biology. If one were to ask systems biologists how they view proteins, protein complexes and so on, a large number of them would describe them as dots, devoid of three-dimensional information, with some associated biophysical properties. In many cases, not even the surrounding cellular spatial information would be detailed.

Understanding the properties of a system by abstracting the functionally relevant behaviors from the underlying cellular components is the very objective of systems analysis. To study a cell, we should be able to work with macromolecules as dots that are produced and degraded, that diffuse or move with active transport, and that interact and/or change properties in a defined space. However, to reach this level of abstraction, we must first have a detailed understanding of protein structure and dynamics. It is uncontroversial to state that without structural biology this is not achievable.

Depending on the specific biological question at hand, different structural details and biophysical properties of protein complexes should be explored to provide significant insight. For example, when part of a signal transduction cascade is analyzed, accurate kinetic constants will be crucial to model a system correctly. As we will discuss below, protein complex structures could be used to predict these kinetic constants in silico. In cases in which understanding the spatial cellular distribution of larger protein complexes is the aim, the affinity or approximate kinetic constants might be enough. In such circumstances, qualitative experimental binding information, as from pull-down assays, can be combined with structural information from electron microscopy and fluorescence imaging.

In the following, we will discuss how structural information is being explored and the role it should increasingly play to reduce cellular components to their key functional properties (Figure 1).

Prediction of protein interactions using structural information
Understanding a system requires knowledge of the network of interactions in space and time. In other words, we need to understand who is interacting with whom, how these interactions affect the properties of the individual components, what the properties of the complexes formed are, and how these interactions change in space and time. Determining protein–protein interactions has therefore become one of the favorites of large-scale projects ranging from pull-down assays [2,3,4] to full yeast two-hybrid analysis [5–8]. Although much progress has been achieved in this area [9], we are still far from having 100% coverage and accuracy [10]. Also, despite the progress in structural genomics projects and the existence of specific large-scale consortia aiming to determine the structures of macromolecular complexes (e.g. http://www.3drepertoire.org/), we are far from having a full atomic description of all cellular complexes. The number of possible complexes, the transient nature of many of them and inherent

Current Opinion in Structural Biology 2007, 17:378–384 www.sciencedirect.com Structures in systems biology Beltrao, Kiel and Serrano 379

Figure 1

Summary of the main concepts discussed in this review. Structural information can be used in many ways to help us retrieve the characteristic functional properties of cellular components. Here, we detail recent advances in the use of protein structures to predict protein interactions, protein function and quantitative binding parameters, curate large-scale protein interaction studies and understand the impact of coding variability.

experimental difficulties make this goal difficult to achieve. Thus, in recent years, efforts have been made in using available structural information to predict and model the structures of interacting proteins (recently reviewed in [11]). Although the prediction of protein–protein interactions using structural information is far from perfect, it is becoming a useful tool that enables not only a Boolean assignment (yes or no) to a particular putative interaction, but also the production of structural models, sometimes at very high resolution [12]. Particular problems that remain to be solved are the modeling of loop conformations, backbone moves and docking. These problems are minimized if many structures are available of complexes involving members of the same family [12].

Quantitative data
In many cases, determining the network of interacting components is not enough to understand a biological system to the level of making successful predictions. For this purpose, quantitative parameters (approximate or detailed depending on the problem [13]) are required. Currently, there are no high-throughput experimental approaches to obtain these values. Thus, the possibility of predicting thermodynamic and kinetic properties of protein complexes, based on X-ray complex structures or homology models, could be one of the major contributions of structural biology to systems biology [14]. Affinities and kinetic constants are important for modeling cellular signal transduction pathways, as is done in SmartCell (http://smartcell.embl.de) [15], whereby diffusion and cellular localization are taken into account. Successful predictions of binding affinities for wild-type and mutant complexes have been carried out using the protein design algorithm FoldX (http://foldx.embl.de) [16–18]. Examples are the prediction of Ras–effector interactions [12,19–21], and interactions of PDZ and SH3 domains with their targets [22,23].

Predicting binding affinities and hot-spot residues is also important in rational design, to modify the binding specificity of ligands. This was successfully done for the TRAIL receptor system, for which DR5-selective TRAIL variants were generated that do not induce apoptosis in DR4-responsive cell lines, but show a large increase in biological activity in DR5-responsive cancer cell lines [24]. Other examples are the successful creation of new specifically interacting DNase–inhibitor pairs [25,26] and the rational design of ICAM-1 mutants with enhanced affinity for its antigen (LFA-1) [27].

Approaches to predict association rate constants make use of the principle of electrostatic steering [28]. Based on this concept, the association rate constant of a protein complex can be enhanced by increasing the electrostatic charge complementarity at the interface and at the edge of the interface. The protein design algorithm PARE [29] was successfully developed to specifically enhance the rate of association, while not affecting the dissociation rate, of various protein systems [29,30,31]. This method has the potential to specifically change the kinetic properties of protein complexes involved in signal transduction pathways and to investigate how the magnitude of signal transduction in vivo changes. Using this design tool, important biological questions can be addressed; for example, whether the magnitude of signal transduction is dependent on the affinity alone, or whether individual association and dissociation rate constants are important as well.

The role of structural information in the post-genomic era
Recent efforts to map all possible interactions between cellular components, in a high-throughput fashion, have created very large data sets. Properly mined, these data should help us to better understand living cells. Extracting meaningful information from these data sets is, however, not a simple task. Most studies of these interactions rely on a simplified network representation, whereby components are nodes and connections between them are denoted as edges. This has enabled the vast amount of information to be grasped in a formal way, leading to the discovery of important and general global network properties [32–34].

We will, undoubtedly, require much richer detail to be added to this representation if we are ever to comprehend how cellular components bring about cellular functions (see Figure 2). Some studies have already started to address this by discriminating between different node types [35,36,37]. Vidal and colleagues have used expression data to identify highly connected proteins (hubs) that interact with their partners either simultaneously (party hubs) or at different times (date hubs). Other studies have tried to

Figure 2

From nodes and edges to atomic detail. We should be able to use large-scale protein interaction data to obtain meaningful insight regarding cellular functions. Most of the studies of these large data sets so far have focused on a simplified formal description of interactions in which components are identically represented. From this 'nodes and edges' view, global network properties emerge [32–34]. For this information to be of more use, one must be able to find within the data set the modules that are traditionally studied from the bottom up. Common to all these modules, we can find local network properties and universal node roles [35,36,37]. For example, no module exists in isolation, so connecting roles are necessary and universal to all cellular modules. We propose that structural information be used to further characterize large-scale networks by providing the key functional properties of cellular components.


study local network properties by first identifying modules within the large networks and then classifying nodes according to their pattern of intramodule or intermodule connections [36,37]. The identification of these modules and different node types brings network analysis closer to more traditional studies of cellular pathways. It is our opinion that structural information can further help to bridge the divide (see Figure 2).

As we describe above, protein structures can be used to predict protein interactions and associated binding constants. We should then be able to curate current interaction networks using structural information to derive pathway models that include rich structural detail (as proposed by Aloy and Russell [11]). To enable this, several databases have already been set up that specialize in repertoires of binding interfaces [38–43] (see Table 1). Using one of these databases, Kim et al. [44] were able to assign binding interfaces for 1269 of the protein–protein interactions of S. cerevisiae. They could then identify mutually exclusive interactions and discriminate protein hubs with many or few binding interfaces. They have shown that multi-interface hubs, compared to hubs with few binding interfaces, are more likely to be essential and are, on average, more conserved, as evaluated by the ratio of non-synonymous to synonymous substitutions. They have also found that the interaction partners of multi-interface hubs have a higher expression correlation (i.e. are expressed at the same time) than those of hubs with few interfaces.

We have used a similar approach to analyze how protein binding specificity might influence the evolutionary turnover of protein interactions [45]. We searched for human proteins with multiple interactions through one domain and compared this group to another with a similar number of interactions through more than one domain. We found that, given the same number of partners, proteins interacting mostly through one domain have a higher rate of change of interactions, in evolution, than proteins interacting through multiple domains. In conjunction with other results, this has led us to propose that more promiscuous protein binding domains have higher evolutionary turnover of their interactions.

Structural information for pathway modeling
The studies discussed above point to the usefulness of using structural information to curate large interaction networks. The challenge will now be to show that it is possible to use structural data to improve pathway models. One very important aspect of modeling concerns the identification of the key functional units, those that mostly determine the properties of the pathway. Once a module has been identified from within the large interaction network, functional information on the components must be analyzed.

In cases in which the functional role of the protein is not known, structural information can be used to direct experimental studies by predicting possible biological functions (reviewed in [46,47]). Successful examples of the structure-based prediction of protein function include the prediction of protein fold [48], binding pockets [49], and interactions with DNA [50] and ions [18]. Prediction of protein fold can be used to transfer functional annotations from other proteins with similar folds, in those cases when sequence-based predictions are not possible. Functional information can also be obtained from structure-based prediction of protein interactions through 'guilt-by-association' reasoning. In many cases, however, protein folds are functionally promiscuous, making the transfer of functional annotation difficult. Also, even if structure-based predictions point to a very likely protein–protein interaction, it might not have an obvious functional role in the pathway of interest. For example, in the case of Ras–effector interactions, high binding affinity does not necessarily mean functional importance. In the case of Ras signal transduction, Rap binds to RalGDS with high affinity, but no functional relevance has been found to date [51]. Therefore, structure-based prediction of functional importance should still be integrated with all other possible data sources for maximum accuracy [52].

Once most components within the cellular module are sufficiently characterized, structural details can be further used to simplify the mathematical model of this pathway. This can be accomplished, for example, by identifying stable complexes. In most cases, it would be sufficient to model these complexes as a single object instead of the

Table 1

Currently available databases of protein binding interfaces.

Database     Domain definitions used   Number of interfaces (a)            References
iPfam        Pfam                      3019 domain–domain interactions     [38]
3did         Pfam                      3034 domain–domain interactions     [39]
SCOPPI       SCOP                      8400 interface types                [40]
PRISM        NA                        3799 interface clusters             [41]
SNAPPI-DB    SCOP, CATH and Pfam       NA                                  [42]
PIBASE       SCOP                      18755 interface types               [43]

(a) The number of interfaces was obtained from the database web site or from the article, where available. Because of the different methods used, it is not possible to compare the numbers directly. NA, information not available.

detailed modeling of all the components. One example for which this reasoning has already been successfully applied is the modeling of microtubule dynamics [53–55].

Pathways, structures and disease
One of the important deliverables of genome sequencing projects is the realization of the diversity at single nucleotide level (single-nucleotide polymorphism; SNP) and even at the level of gene copy number [56]. Currently, more than 4.5 million unique SNPs have been identified and catalogued for the human genome [57]. A challenge is to identify which of these variations will have a phenotypical consequence and their context dependence on other variations at SNP or higher order level. There are several techniques available that rely on different parameters, such as sequence conservation across species, to try to predict the phenotypic consequences of SNPs [58]. However, for globular proteins, we now have the possibility of predicting the effect of SNPs on protein stability, aggregation, function and complex formation (e.g. SNPeffect [59]). Thus, the integration of this information for pathways for which we have detailed information on parameters, interactions and structures of their components could open the way to predicting, with high reliability, the impact of SNPs on cellular processes and ultimately on human health. Moreover, using structural information and existing protein design algorithms, it is possible to introduce point mutations to alter the affinity and specificity of protein–protein interactions [24,25,26]. This is interesting, on the one hand, for the development of novel peptide-based therapies [24]. Also, by tuning protein interactions, we could perturb, in a controlled way, particular cellular processes and obtain a better understanding of the biological system through reverse engineering [60].

Conclusions
We are still far from a working model of all cellular functions. We have summarized above what we believe to be the role of structural biology in our pursuit to understand the most meaningful characteristics of cellular components. Structural information has proven useful for the prediction of protein function in general and, in particular, for the prediction of protein interactions and their respective binding constants. Incorporating structural detail into current large-scale protein interaction networks has provided new insight into their properties. The work summarized here hints at many interesting challenges for future research. Structural genomics should be directed at maximizing the coverage of possible binding interfaces. New bioinformatics resources are necessary to atomically predict and assign binding interfaces to cellular interactions. Finally, more effort should be directed at correctly determining binding constants for structurally resolved binding interfaces. Advances in these areas will greatly extend our capacity to model cellular pathways and enhance our understanding of cellular functions.

Acknowledgements
We thank the EU for financial support (INTERACTION PROTEOME grant number LSHG-CT-2003-505520 and COMBIO grant number LSHG-CT-2004-503568). Pedro Beltrao is supported by a grant from Fundação para a Ciência e Tecnologia through the Graduate Program in Areas of Basic and Applied Biology.

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Bork P, Serrano L: Towards cellular systems in 4D. Cell 2005, 121:507-509.

2. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM et al.: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415:141-147.

3. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams S-L, Millar A, Taylor P, Bennett K, Boutilier K et al.: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002, 415:180-183.

4. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B et al.: Proteome survey reveals modularity of the yeast cell machinery. Nature 2006, 440:631-636.
   This study describes a genome-wide screen for complexes in budding yeast, using affinity purification and mass spectrometry. It compiled, in a systematic way, the largest collection of experimentally determined eukaryotic cellular machines to date, revealing the modularity of the S. cerevisiae cellular network.

5. Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N et al.: Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437:1173-1178.

6. Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P et al.: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403:623-627.

7. Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA 2001, 98:4569-4574.

8. Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S et al.: A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122:957-968.

9. Jensen LJ, Jensen TS, de Lichtenberg U, Brunak S, Bork P: Co-evolution of transcriptional and post-translational cell-cycle regulation. Nature 2006, 443:594-597.

10. Mathivanan S, Periaswamy B, Gandhi T, Kandasamy K, Suresh S, Mohmood R, Ramachandra Y, Pandey A: An evaluation of human protein-protein interaction data in the public domain. BMC Bioinformatics 2006, 7(suppl 5):S19.

11. Aloy P, Russell RB: Structural systems biology: modelling protein interactions. Nat Rev Mol Cell Biol 2006, 3:188-197.

12. Kiel C, Wohlgemuth S, Rousseau F, Schymkowitz J, Ferkinghoff-Borg J, Wittinghofer F, Serrano L: Recognizing and defining true Ras binding domains II: In silico prediction based on homology modelling and energy calculations. J Mol Biol 2005, 348:759-775.

13. Di Ventura B, Lemerle C, Michalodimitrakis K, Serrano L: From in vivo to in silico biology and back. Nature 2006, 443:527-533.

14. Kiel C, Serrano L: Affinity can have many faces: thermodynamic and kinetic properties of Ras-effector complex formation. Curr Chem Biol 2007, 1:215-225.

15. Ander M, Beltrao P, Di Ventura B, Ferkinghoff-Borg J, Foglierini M, Kaplan A, Lemerle C, Tomas-Oliveira I, Serrano L: SmartCell, a framework to simulate cellular processes that combines stochastic approximation with diffusion and localisation: analysis of simple networks. Syst Biol 2004, 1:129-138.

16. Guerois R, Nielsen JE, Serrano L: Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 2002, 320:369-387.

17. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L: The FoldX web server: an online force field. Nucleic Acids Res 2005, 33:W382-W388.

18. Schymkowitz JWH, Rousseau F, Martins IC, Ferkinghoff-Borg J, Stricher F, Serrano L: Prediction of water and metal binding sites and their affinities by using the Fold-X force field. Proc Natl Acad Sci USA 2005, 102:10147-10152.

19. Kiel C, Serrano L, Herrmann C: A detailed thermodynamic analysis of Ras/effector complex interfaces. J Mol Biol 2004, 340:1039-1058.

20. Wohlgemuth S, Kiel C, Kraemer A, Serrano L, Wittinghofer F, Herrmann C: Recognizing and defining true Ras binding domains I: Biochemical analysis. J Mol Biol 2005, 348:741-758.

21. Kiel C, Serrano L: The ubiquitin domain superfold: structure-based sequence alignments and characterization of binding epitopes. J Mol Biol 2006, 355:821-844.

22. Kempkens O, Médina E, Fernandez-Ballester G, Özüyaman S, Le Bivic A, Serrano L, Knust E: Computer modelling in combination with in vitro studies reveals similar binding affinities of Drosophila Crumbs for the PDZ domains of Stardust and DmPar-6. Eur J Cell Biol 2006, 8:753-767.

23. Musi V, Birdsall B, Fernandez-Ballester G, Guerrini R, Salvatori S, Serrano L, Pastore A: New approaches to high-throughput structure characterization of SH3 complexes: the example of myosin-3 and myosin-5 SH3 domains from S. cerevisiae. Protein Sci 2006, 4:795-807.

24. Van der Sloot AM, Tur V, Szegezdi E, Mullally MM, Cool RH, Samali A, Serrano L, Quax WJ: Designed tumor necrosis factor-related apoptosis-inducing ligand variants initiating apoptosis exclusively via the DR5 receptor. Proc Natl Acad Sci USA 2006, 103:8634-8639.
   By using the automatic design algorithm FOLD-X, DR5-selective TRAIL variants have been generated. These variants do not induce apoptosis in DR4-responsive cell lines, but show a large increase in biological activity in DR5-responsive cancer cell lines. This study shows that the specificity of protein–protein interactions can be successfully modified using computational protein design without having high-quality structural data available for all relevant interactions. Moreover, it demonstrates that changing the specificity of protein–protein interactions using computational design could be a valuable approach for designing novel therapeutics.

25. Kortemme T, Joachimiak LA, Bullock AN, Schuler AD, Stoddard BL, Baker D: Computational redesign of protein-protein interaction specificity. Nat Struct Mol Biol 2004, 11:371-379.

26. Joachimiak LA, Kortemme T, Stoddard BL, Baker D: Computational design of a new hydrogen bond network at a protein-protein interface. J Mol Biol 2006, 361:195-208.

27. Song G, Lazar GA, Kortemme T, Shimaoka M, Desjarlais JR, Baker D, Springer TA: Rational design of intercellular adhesion molecule-1 (ICAM-1) variants for antagonizing integrin lymphocyte function-associated antigen-1-dependent adhesion. J Biol Chem 2006, 281:5042-5049.

28. Sheinerman FB, Norel R, Honig B: Electrostatic aspects of protein–protein interactions. Curr Opin Struct Biol 2000, 10:153-159.

29. Selzer T, Albeck S, Schreiber G: Rational design of faster associating and tighter binding protein complexes. Nat Struct Biol 2000, 7:537-541.
   PARE is a protein design strategy to enhance the rate constant of association by increasing the electrostatic attraction between a protein pair. A very high correlation is found between experimentally determined association rate constants for mutants and calculated changes in association rate constants relative to the wild-type complex. Using this design strategy and maintaining the dissociation rate constant, the affinity between TEM1 and its protein inhibitor, BLIP, could be enhanced 250-fold.

30. Kiel C, Selzer T, Shaul Y, Schreiber G, Herrmann C: Electrostatically optimized Ras-binding Ral guanine dissociation stimulator mutants increase the rate of association by stabilizing the encounter complex. Proc Natl Acad Sci USA 2004, 101:9223-9228.

31. Peleg-Shulman T, Roisman LC, Zupkovitz G, Schreiber G: Optimizing the binding affinity of a carrier protein: a case study on the interaction between soluble ifnar2 and interferon beta. J Biol Chem 2004, 279:18046-18053.

32. Barabasi A-L, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet 2004, 5:101-113.

33. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL: The large-scale organization of metabolic networks. Nature 2000, 407:651-654.

34. Jeong H, Mason SP, Barabasi AL, Oltvai ZN: Lethality and centrality in protein networks. Nature 2001, 411:41-42.

35. Han JD, Bertin N, Hao T, Goldberg DS, Berriz GF, Zhang LV, Dupuy D, Walhout AJ, Cusick ME, Roth FP et al.: Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 2004, 430:88-93.

36. Guimera R, Nunes Amaral LA: Functional cartography of complex metabolic networks. Nature 2005, 433:895-900.
   This study shows that it is possible to discriminate between different node types, in metabolic networks, by analyzing the edges occurring within and/or between modules. The ability to use topological information to identify modules and different node types underscores the usefulness of the network formalism for the study of cellular pathways.

37. Guimerà R, Sales-Pardo M, Nunes Amaral LA: Classes of complex networks defined by role-to-role connectivity profiles. Nature Physics 2007, 3:63-69.

38. Finn RD, Marshall M, Bateman A: iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions. Bioinformatics 2005, 21:410-412.

39. Stein A, Russell RB, Aloy P: 3did: interacting protein domains of known three-dimensional structure. Nucleic Acids Res 2005, 33:D413-D417.

40. Winter C, Henschel A, Kim WK, Schroeder M: SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res 2006, 34:D310-D314.

41. Ogmen U, Keskin O, Aytuna AS, Nussinov R, Gursoy A: PRISM: protein interactions by structural matching. Nucleic Acids Res 2005, 33:W331-W336.

42. Jefferson ER, Walsh TP, Roberts TJ, Barton GJ: SNAPPI-DB: a database and API of structures, interfaces and alignments for protein-protein interactions. Nucleic Acids Res 2007, 35:D580-D589.

43. Davis FP, Sali A: PIBASE: a comprehensive database of structurally defined protein interfaces. Bioinformatics 2005, 21:1901-1907.

44. Kim PM, Lu LJ, Xia Y, Gerstein MB: Relating three-dimensional structures to protein networks provides evolutionary insights. Science 2006, 314:1938-1941.
   This is the first large-scale curation effort using structural information on binding interfaces. Using the iPfam database, the authors determined the most likely binding interface for all S. cerevisiae protein–protein interactions. This information was used to show that protein essentiality/lethality, protein sequence conservation and co-expression of binding partners are dependent on the number of binding interfaces.

45. Beltrao P, Serrano L: Specificity and evolvability in eukaryotic protein interaction networks. PLoS Comput Biol 2007, 3:e25.

46. Watson JD, Laskowski RA, Thornton JM: Predicting protein function from sequence and structural data. Curr Opin Struct Biol 2005, 3:275-284.

47. Rigden DJ: Understanding the cell in terms of structure and function: insights from structural genomics. Curr Opin Biotechnol 2006, 5:457-464.

48. Dunbrack RL Jr: Sequence comparison and protein structure prediction. Curr Opin Struct Biol 2006, 16:374-384.

49. Reichmann D, Rahat O, Cohen M, Neuvirth H, Schreiber G: The molecular architecture of protein-protein binding sites. Curr Opin Struct Biol 2007, 17:67-76.

50. Kim JS, DeGiovanni A, Jancarik J, Adams PD, Yokota H, Kim R, Kim SH: Crystal structure of DNA sequence specificity subunit of a type I restriction-modification enzyme and its functional implications. Proc Natl Acad Sci USA 2005, 102:3248-3253.

51. Kooistra MRH, Dube N, Bos JL: Rap1: a key regulator in cell-cell junction formation. J Cell Sci 2007, 120:17-22.

52. Lu LJ, Xia Y, Paccanaro A, Yu H, Gerstein M: Assessing the limits of genomic data integration for predicting protein networks. Genome Res 2005, 15:945-953.

53. Pearson CG, Yeh E, Gardner M, Odde D, Salmon ED, Bloom K: Stable kinetochore-microtubule attachment constrains centromere positioning in metaphase. Curr Biol 2004, 14:1962-1967.

54. Gardner MK, Pearson CG, Sprague BL, Zarzar TR, Bloom K, Salmon ED, Odde DJ: Tension-dependent regulation of microtubule dynamics at kinetochores can explain metaphase congression in yeast. Mol Biol Cell 2005, 16:3764-3775.

55. Pearson CG, Gardner MK, Paliulis LV, Salmon ED, Odde DJ, Bloom K: Measuring nanometer scale gradients in spindle microtubule dynamics using model convolution microscopy. Mol Biol Cell 2006, 17:4069-4079.

56. Sachidanandam R, Weissman D, Schmidt S, Kakol J, Stein L, Marth G, Sherry S, Mullikin J, Mortimore B, Willey D et al.: A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 2001, 409:928-933.

57. Thorisson GA, Stein LD: The SNP Consortium website: past, present and future. Nucleic Acids Res 2003, 31:124-127.

58. Ramensky V, Bork P, Sunyaev S: Human non-synonymous SNPs: server and survey. Nucleic Acids Res 2002, 30:3894-3900.

59. Reumers J, Schymkowitz J, Ferkinghoff-Borg J, Stricher F, Serrano L, Rousseau F: SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs. Nucleic Acids Res 2005, 33:D527-D532.

60. Tegner J, Yeung MK, Hasty J, Collins JJ: Reverse engineering gene networks: Integrating genetic perturbations with dynamical modelling. Proc Natl Acad Sci USA 2003, 100:5944-5949.
