Protein Interaction Networks and Their Applications to Protein Characterization and Cancer Genes Prediction
Total Page:16
File Type:pdf, Size:1020Kb
Ramón Aragüés Peleato Protein interaction networks and their applications to protein characterization and cancer genes prediction PhD Thesis Barcelona, May 2007 1 The image of the cover shows the happiness protein interaction network (i.e. the protein interaction network for proteins involved in the serotonin pathway) ii Protein Interaction Networks and their Applications to Protein Characterization and Cancer Genes Prediction Ramón Aragüés Peleato Memòria presentada per optar al grau de Doctor en Biologia per la Universitat Pompeu Fabra. Aquesta Tesi Doctoral ha estat realitzada sota la direcció del Dr. Baldo Oliva al Departament de Ciències Experimentals i de la Salut de la Universitat Pompeu Fabra Baldo Oliva Miguel Ramón Aragüés Peleato Barcelona, Maig 2007 The research in this thesis has been carried out at the Structural Bioinformatics Lab (SBI) within the Grup de Recerca en Informàtica Biomèdica at the Parc de Recerca Biomèdica de Barcelona (PRBB). The research carried out in this thesis has been supported by a “Formación de Personal Investigador (FPI)” grant from the Ministerio de Educación y Ciencia awarded to Dr. Baldo Oliva. A mis padres, que me hicieron querer conseguirlo A Natalia, que me ayudó a conseguir hacerlo ii TABLE OF CONTENTS TABLE OF CONTENTS................................................................................................................................... I ACKNOWLEDGEMENTS............................................................................................................................ III THESIS ABSTRACT........................................................................................................................................V CHAPTER I - INTRODUCTION.....................................................................................................................7 1.1 INTRODUCTION OVERVIEW .........................................................................................................................9 1.2 BACKGROUND .......................................................................................................................................9 1.2.1 Molecular cell biology.......................................................................................................................9 1.2.2 Bioinformatics and Computational Biology ....................................................................................11 1.3 GENOMICS AND PROTEOMICS ...................................................................................................................12 1.3.1 Two-dimensional gel electrophoresis and mass spectrometry.........................................................12 1.3.2 Three-dimensional structure determination.....................................................................................13 1.3.3 Gene Expression Profiling...............................................................................................................13 1.3.4 Protein chips....................................................................................................................................13 1.3.5 ChIp-on-Chip...................................................................................................................................14 1.3.6 Fluorescent tagging.........................................................................................................................14 1.3.7 Methods for the detection of protein-protein interactions ...............................................................14 1.4 REPOSITORIES OF BIOLOGICAL INFORMATION ..........................................................................................18 1.4.1 Bio-sequences repositories ..............................................................................................................18 1.4.2 Gene expression studies repositories...............................................................................................18 1.4.3 Protein domains, families and functional sites ................................................................................18 1.4.4 Protein function repositories ...........................................................................................................20 1.4.5 Protein-protein interactions repositories.........................................................................................21 1.5 PROTEIN INTERACTION NETWORKS...........................................................................................................23 1.5.1 Protein interaction networks properties ..........................................................................................25 1.5.2 Protein interaction networks applications.......................................................................................26 1.6 BIOLOGICAL DATA INTEGRATION .............................................................................................................27 1.6.1 Genes and proteins nomenclature ...................................................................................................29 1.6.2 Protein-protein interactions integration..........................................................................................30 1.7 SOFTWARE ...............................................................................................................................................32 1.7.1 Software for translating gene and protein identifiers......................................................................32 1.7.2 Software for protein interaction data integration, analysis and visualization.................................32 1.8 THE ROLE OF BIOINFORMATICS IN CANCER RESEARCH .............................................................................34 1.9 DOCUMENT ORGANIZATION......................................................................................................................36 OBJECTIVES ..................................................................................................................................................39 LIST OF PUBLICATIONS.............................................................................................................................41 CHAPTER II - PIANA (PROTEIN INTERACTIONS AND NETWORK ANALYSIS) ..........................42 CHAPTER III - PROTEIN HUBS CHARACTERIZATION BY INFERRING INTERACTING MOTIFS FROM PROTEIN INTERACTIONS ............................................................................................66 CHAPTER IV - AN INTEGRATIVE APPROACH TO PREDICTING CANCER GENES....................92 CHAPTER V - OTHER APPLICATIONS OF PROTEIN-PROTEIN INTERACTIONS.....................124 CHAPTER VI - DISCUSSION .....................................................................................................................146 6.1.1 Providing universal access to PIANA............................................................................................149 6.1.2 Representation and analysis of transient and permanent interactions ..........................................150 6.1.3 Biological Interactions And Network Analysis (BIANA) ...............................................................150 6.1.4 Interaction Confidence Score ........................................................................................................151 6.1.5 Visualization of protein interaction networks................................................................................152 6.1.6 Integrating sequence information into the method for delineating interacting motifs...................152 6.1.7 PIANA and Diseases......................................................................................................................153 6.1.8 Using PIANA to detect remote homologs ......................................................................................153 6.1.9 The path is consensus, not integration...........................................................................................154 CONCLUSIONS ............................................................................................................................................156 EPILOGUE.....................................................................................................................................................160 THESIS REFERENCES................................................................................................................................162 ii ACKNOWLEDGEMENTS I know this is the section of the thesis that most people read; in fact, it might be the only section of the thesis most people ever think about reading. Maybe in your case this acknowledgements thing is even more pronounced than it usually is, and it has become the central part of your life and it is all you care about. Or perhaps, you didn’t even know there was an acknowledgements section in a PhD thesis, and you just told yourself: “let’s read it!”. Whatever it is that made you read this, you are going to regret it, because these are going to be the most disappointing acknowledgements in human history. At least, for those who do not know the answers… http://www.aragues.com/thesis_ack.php iii iv THESIS ABSTRACT The importance of understanding cellular processes has prompted the development of experimental approaches that detect protein-protein interactions. Recently, most approaches have been focused on large-scale screenings of protein-protein interactions, such as two- hybrid assays and affinity purifications followed by mass spectrometry.