University of California, San Diego

Total Page:16

File Type:pdf, Size:1020Kb

University of California, San Diego UNIVERSITY OF CALIFORNIA, SAN DIEGO Understanding cellular function through the analysis of protein interaction networks A Dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Bioinformatics by Silpa Suthram Committee in charge: Professor Trey G. Ideker, Chair Professor Vineet Bafna Professor Philip Bourne Professor Jeff Hasty Professor Daniel O’Connor ` 2008 Copyright Silpa Suthram, 2008 All rights reserved. The Dissertation of Silpa Suthram is approved, and it is acceptable in quality and form for publication on microfilm: Chair University of California, San Diego 2008 iii To Sanjit I couldn’t have done it without you. iv Aum Tamaso Ma Jyotir Gamaya || (Brhadaranyaka Upanishad — I.iii.28) Translation: Lead me from darkness (ignorance) to light (knowledge). v Table of Contents Signature Page...........................................................................................................iii Dedication.................................................................................................................. iv Epigraph .......................................................................................................................v Table of Contents ..................................................................................................... vi List of Figures.............................................................................................................ix List of Tables ...............................................................................................................x Acknowledgements...................................................................................................xi Vita, Publications .....................................................................................................xv Abstract..................................................................................................................... xvi 1 Introduction.........................................................................................................1 1.1 Generation and visualization of large‐scale protein interaction data..3 1.1.1 Yeast‐two hybrid assay...........................................................................3 1.1.2 Tandem Affinity Purification (TAP) followed by mass‐spectrometry...4 1.1.3 Visualization of protein interaction networks........................................5 1.2 Overview of the dissertation......................................................................6 1.2.1 Noise in the data.....................................................................................7 1.2.2 Comparative network analysis ...............................................................8 1.2.3 Network integration ...............................................................................9 2 Assigning confidence scores to protein interactions.................................15 2.1 Estimation of interaction probabilities ...................................................18 2.2 Benchmarking analysis.............................................................................20 2.3 Quality assessment....................................................................................27 2.3.1 Presence of conserved interactions in other species..............................28 2.3.2 Expression correlation..........................................................................29 vi 2.3.3 Gene Ontology (GO) similarity ...........................................................30 2.3.4 Signal‐to‐Noise Ratio of protein complexes .........................................32 2.3.5 Conservation rate coherency ................................................................33 2.4 Discussion...................................................................................................34 2.5 Methods ......................................................................................................37 2.5.1 GO databases........................................................................................37 2.5.2 Signal to noise ratio (SNR) ..................................................................38 2.5.3 Weighted average..................................................................................39 3 Comparative network analysis ......................................................................47 3.1 NetworkBLAST Algorithm ......................................................................49 3.1.1 Network alignment graph ....................................................................49 3.1.2 A probabilistic model of protein sub‐networks.....................................51 3.1.3 Search algorithm...................................................................................54 3.2 Complexes conserved across multiple species......................................57 3.2.1 Three‐way versus two‐way network alignment...................................61 3.2.2 Prediction of new protein functions .....................................................62 3.2.3 Prediction of new protein interactions .................................................63 3.3 Discussion...................................................................................................65 3.3.1 Comparison to existing methods ..........................................................65 3.3.2 Best BLAST hits may not imply functional conservation ...................66 3.3.3 Functional links within conserved networks........................................67 3.3.4 Validation of predicted interactions .....................................................69 3.3.5 Conclusion............................................................................................70 3.4 Methods ......................................................................................................71 3.4.1 Scoring functional enrichment.............................................................71 3.4.2 Prediction of protein functions.............................................................72 3.4.3 Prediction of protein interactions.........................................................75 3.4.4 Automatic layout of conserved complexes............................................78 4 Comparative network analysis of the protein network of the malaria parasite Plasmodium falciparum ...........................................................................94 4.1 Cross‐species comparison of the Plasmodium protein network........95 4.1.1 Data sources .........................................................................................95 4.1.2 Comparative analysis of the Plasmodium PPI network.......................96 4.1.3 Analysis of conserved and distinct complexes in Plasmodium..........101 4.1.4 Distribution of GO Cellular Components across species...................104 4.2 Discussion.................................................................................................106 vii 4.3 Methods ....................................................................................................108 4.3.1 Identification of conserved and species only complexes .....................108 4.3.2 Interaction confidence scores..............................................................109 4.3.3 Phylogenetic tree construction...........................................................109 4.3.4 SNR of protein complexes ..................................................................110 4.3.5 Simulating false‐positives and negatives in protein networks...........111 5 Network Integration: An efficient method for interpreting eQTL associations using protein networks ..................................................................124 5.1 “Electric circuit” analysis of eQTLs ......................................................126 5.1.1 Definition of terms .............................................................................126 5.1.2 Open problems motivated by previous method ..................................127 5.1.3 The eQED model ................................................................................128 5.1.4 Application to eQTL associations in yeast.........................................129 5.1.5 Combining multiple loci.....................................................................131 5.1.6 Predicting the direction of signaling along protein interactions .......132 5.1.7 Prediction of regulatory pathways .....................................................134 5.1.8 Application to gene‐association studies in human.............................135 5.2 Materials and Methods ...........................................................................137 5.2.1 Electric circuits and random walks....................................................137 5.2.2 eQTL Associations .............................................................................139 5.2.3 High‐confidence physical interaction network...................................140 5.2.4 eQED Model.......................................................................................143 6 Conclusion .......................................................................................................150 6.1 Future directions......................................................................................151 7 References ........................................................................................................155 viii List of Figures Figure 1.1: Schematic of the yeast two‐hybrid assay...........................................12 Figure
Recommended publications
  • The 7Th White Coat Ceremony at SSPPS by Binh Tran, Pharmd
    The 7th White Coat Ceremony at SSPPS By Binh Tran, PharmD The White Coat Ceremony at the Skaggs School of Pharmacy and Pharmaceutical Sciences (SSPPS) on September 25th held special meaning as the University of California - San Diego celebrates its 50th Anniversary this year. SSPPS faculty, alumni, student pharmacists, the incoming class of 2014, their families and their friends listened to Dean Palmer Taylor’s welcoming address in the spacious Health Sciences auditorium modeled with panels made of eucalyptus wood. Dr Rita Shane, Director of Pharmacy Services at Cedars-Sinai Medical Center and Assistant Dean of Clinical Pharmacy at the UCSF School of Pharmacy delivered an energizing speech on The Power of Differentiation: Will you be a Brand or Generic? Attributes mentioned were competency, responsibility and accountability, intellectual curiosity, being able to seize opportunities, and establishing credibility. Dr. Anthony Manoguerra, Associate Dean for Student Affairs, reported that the class of 2014’s grade point average of 3.71 broke the record set by the previous class which averaged 3.70. The White Coat hooding by Dean Taylor was a moving ceremony with Justine Abella being last due to a recent foot injury. The class recited the Pharmacist’s Oath, which brought back memories of our own experiences. The Student Welcome by second- year student, Ammar Zanial, was heartfelt and reminiscent of his comments during pharmacy conference class. Little did I know at the time that the student pharmacist who volunteered at many outreaches in the community and helped backstage during Cultural Fusion Night was the President of his class. In closing, Dean Taylor commented on the ingeniousness of the class when he observed them tacitly move their seats in order to step closer to the stage.
    [Show full text]
  • Are You an Invited Speaker? a Bibliometric Analysis of Elite Groups for Scholarly Events in Bioinformatics
    Are You an Invited Speaker? A Bibliometric Analysis of Elite Groups for Scholarly Events in Bioinformatics Senator Jeong, Sungin Lee, and Hong-Gee Kim Biomedical Knowledge Engineering Laboratory, Seoul National University, 28–22 YeonGeon Dong, Jongno Gu, Seoul 110–749, Korea. E-mail: {senator, sunginlee, hgkim}@snu.ac.kr Participating in scholarly events (e.g., conferences, work- evaluation, but it would be hard to claim that they have pro- shops, etc.) as an elite-group member such as an orga- vided comprehensive lists of evaluation measurements. This nizing committee chair or member, program committee article aims not to provide such lists but to add to the current chair or member, session chair, invited speaker, or award winner is beneficial to a researcher’s career develop- practices an alternative metric that complements existing per- ment.The objective of this study is to investigate whether formance measures to give a more comprehensive picture of elite-group membership for scholarly events is represen- scholars’ performance. tative of scholars’ prominence, and which elite group is By one definition (Jeong, 2008), a scholarly event is the most prestigious. We collected data about 15 global “a sequentially and spatially organized collection of schol- (excluding regional) bioinformatics scholarly events held in 2007. We sampled (via stratified random sampling) ars’ interactions with the intention of delivering and shar- participants from elite groups in each event. Then, bib- ing knowledge, exchanging research ideas, and performing liometric indicators (total citations and h index) of seven related activities.” As such, scholarly events are communica- elite groups and a non-elite group, consisting of authors tion channels from which our new evaluation tool can draw who submitted at least one paper to an event but were its supporting evidence.
    [Show full text]
  • Your Scientific Reputation Class Slides
    Your Scientific Reputation and Being a Responsible Member of Society Peter D Sottile, MD October 2020 • Silence personal devices. • Stay muted when not talking. • Set up in a quiet location. Housekeeping: Zoom • Remain attentive. Avoid checking Etiquette: email/phone/web. • Use the Chat function to ask questions or get technical help. • Use your full name, not an alias. Receiving credit for attendance: To satisfy the NIH Requirement for Instruction in the Responsible Conduct of Research, the following are required in order to receive credit for attendance: Attend the full 90 minutes of the training. Attending any 8 out of the 9 RCR seminars we offer will satisfy the NIH requirement. Keep your video camera on throughout the session. NIH requirements for RCR training specify face- to-face discussion. Participate interactively throughout the session. Participate in discussions, respond to polls, and sign the attendance sheet (link will be distributed in the Chat). Raise your Hand to participate in discussions: In order to participate in discussions, raise your hand. Try it now! • Click “Participants” at the bottom of your screen. • Click “Raise Hand” in the popup window. • Click “Lower Hand” to stop raising your hand. 3 Objectives: • Explain why a good scientific reputation is important in academia • Describe the factors that contribute to a good scientific reputation • Identify the components to being a responsible member of the scientific community • Describe threats that may harm one’s scientific reputation, and responses to challenges
    [Show full text]
  • Annual Scientific Report 2011 Annual Scientific Report 2011 Designed and Produced by Pickeringhutchins Ltd
    European Bioinformatics Institute EMBL-EBI Annual Scientific Report 2011 Annual Scientific Report 2011 Designed and Produced by PickeringHutchins Ltd www.pickeringhutchins.com EMBL member states: Austria, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom. Associate member state: Australia EMBL-EBI is a part of the European Molecular Biology Laboratory (EMBL) EMBL-EBI EMBL-EBI EMBL-EBI EMBL-European Bioinformatics Institute Wellcome Trust Genome Campus, Hinxton Cambridge CB10 1SD United Kingdom Tel. +44 (0)1223 494 444, Fax +44 (0)1223 494 468 www.ebi.ac.uk EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg Germany Tel. +49 (0)6221 3870, Fax +49 (0)6221 387 8306 www.embl.org [email protected] EMBL Grenoble 6, rue Jules Horowitz, BP181 38042 Grenoble, Cedex 9 France Tel. +33 (0)476 20 7269, Fax +33 (0)476 20 2199 EMBL Hamburg c/o DESY Notkestraße 85 22603 Hamburg Germany Tel. +49 (0)4089 902 110, Fax +49 (0)4089 902 149 EMBL Monterotondo Adriano Buzzati-Traverso Campus Via Ramarini, 32 00015 Monterotondo (Rome) Italy Tel. +39 (0)6900 91402, Fax +39 (0)6900 91406 © 2012 EMBL-European Bioinformatics Institute All texts written by EBI-EMBL Group and Team Leaders. This publication was produced by the EBI’s Outreach and Training Programme. Contents Introduction Foreword 2 Major Achievements 2011 4 Services Rolf Apweiler and Ewan Birney: Protein and nucleotide data 10 Guy Cochrane: The European Nucleotide Archive 14 Paul Flicek:
    [Show full text]
  • BD2K Program Book Cover
    Table of Contents Agenda Session Descriptions Birds of a Feather Information Birds of a Feather Descriptions Floor Plan of Bethesda North Marriott Hotel & Conference Center Poster/Demo Presentations Information Poster/Demo Presentations Index Poster/Demo Presentations Map Abstract Information Abstract Index Abstracts, in agenda order by the following topics: Research Highlights Data Commons Standards Development Training & Workforce Development BioCADDIE & Resource Indexing Software, Analysis, & Methods Development Collaborative Presentations Sustainability Keynote Speaker Biographies Open Science Prize Information Open Science Prize Fact Sheet Open Science Prize Public Voting Information Open Science Prize Advisors Dining Options Local Restaurants and Hotel Gift Shop Menu Big Data to Knowledge (BD2K) All Hands Meeting and Open Data Science Symposium November 29 - December 1, 2016 Bethesda North Marriott Hotel & Conference Center #BD2KOpenSci | #BD2K_AHM | Info: [email protected] AGENDA Tuesday, November 29, 2016 (Day 1) 7:30 a.m. – 8:30 a.m. Registration/Check-in and Poster Set-up (Registration) 8:30 a.m. – 8:45 a.m. Welcome and Meeting Organization: Jennie Larkin (Salons E-H) 8:45 a.m. – 9:15 a.m. BD2K at NIH – A Vision Through 2020: Philip Bourne NOTES 9:15 a.m. – 10:00 a.m. Keynote Address: Robert Califf, Commissioner, U.S. Food and Drug Administration 10:00 a.m. – 10:30 a.m. Coffee Break & Posters (Salons A-D) 10:30 a.m. – 12:00 p.m. Concurrent Sessions: A. Research Highlights #1 (Session Organizers: Jennie Larkin & Philip Bourne) (Salon E) NOTES B. Data Commons (Session Organizer: Vivien Bonazzi) (Salons F-H) NOTES 12:00 p.m. – 1:00 p.m.
    [Show full text]
  • Protein Family Expansions and Biological Complexity
    Protein Family Expansions and Biological Complexity Christine Vogel1,2*, Cyrus Chothia1 1 Medical Research Council Laboratory of Molecular Biology, Cambridge, United Kingdom, 2 Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America During the course of evolution, new proteins are produced very largely as the result of gene duplication, divergence and, in many cases, combination. This means that proteins or protein domains belong to families or, in cases where their relationships can only be recognised on the basis of structure, superfamilies whose members descended from a common ancestor. The size of superfamilies can vary greatly. Also, during the course of evolution organisms of increasing complexity have arisen. In this paper we determine the identity of those superfamilies whose relative sizes in different organisms are highly correlated to the complexity of the organisms. As a measure of the complexity of 38 uni- and multicellular eukaryotes we took the number of different cell types of which they are composed. Of 1,219 superfamilies, there are 194 whose sizes in the 38 organisms are strongly correlated with the number of cell types in the organisms. We give outline descriptions of these superfamilies. Half are involved in extracellular processes or regulation and smaller proportions in other types of activity. Half of all superfamilies have no significant correlation with complexity. We also determined whether the expansions of large superfamilies correlate with each other. We found three large clusters of correlated expansions: one involves expansions in both vertebrates and plants, one just in vertebrates, and one just in plants.
    [Show full text]
  • Crossm Mbio.Asm.Org
    Downloaded from PERSPECTIVE crossm mbio.asm.org Identifying and Overcoming Threats to Reproducibility, on June 11, 2018 - Published by Replicability, Robustness, and Generalizability in Microbiome Research Patrick D. Schlossa aDepartment of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA mbio.asm.org ABSTRACT The “reproducibility crisis” in science affects microbiology as much as any other area of inquiry, and microbiologists have long struggled to make their re- search reproducible. We need to respect that ensuring that our methods and results are sufficiently transparent is difficult. This difficulty is compounded in interdisciplin- ary fields such as microbiome research. There are many reasons why a researcher is unable to reproduce a previous result, and even if a result is reproducible, it may not be correct. Furthermore, failures to reproduce previous results have much to teach us about the scientific process and microbial life itself. This Perspec- tive delineates a framework for identifying and overcoming threats to reproducibil- ity, replicability, robustness, and generalizability of microbiome research. Instead of seeing signs of a crisis in others’ work, we need to appreciate the technical and social difficulties that limit reproducibility in the work of others as well as our own. KEYWORDS American Academy of Microbiology, microbiome, reproducibility, research ethics, scientific method n first blush, one might argue that any scientist should be able to reproduce Oanother scientist’s research with no friction. Yet, two anecdotes suffice to describe why this is not the case. The first goes to the roots of microbiology, when Antonie van Leeuwenhoek submitted a letter to the Royal Society in 1677, “Concerning little animals” (1).
    [Show full text]
  • Program Schedule
    2015 IEEE International Conference on Bioinformatics and Biomedicine Nov 9-12, 2015, Washington D.C. , USA 1 Sponsored by 2 IEEE BIBM 2015 IEEE BIBM 2015 Program Schedule ..................................................................................................................................... 4 Floor Plan ................................................................................................................................................................................ 9 Keynote Lectures .................................................................................................................................................................. 10 Invited Talks ......................................................................................................................................................................... 12 Federal Agencies and Industry Panel .................................................................................................................................... 15 Workshops ............................................................................................................................................................................ 16 Tutorials ................................................................................................................................................................................ 26 Conference Paper Presentations ...........................................................................................................................................
    [Show full text]
  • Governing Medical Knowledge Commons
    Downloaded from https://www.cambridge.org/core. IP address: 170.106.202.126, on 02 Oct 2021 at 13:46:24, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://www.cambridge.org/core/product/C4CAF56DF197BA93E850E65509EB36E8 Downloaded from https://www.cambridge.org/core. IP address: 170.106.202.126, on 02 Oct 2021 at 13:46:24, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://www.cambridge.org/core/product/C4CAF56DF197BA93E850E65509EB36E8 governing medical knowledge commons Governing Medical Knowledge Commons makes three claims: first, evidence matters to innovation policymaking; second, evidence shows that self-governing knowledge com- mons support effective innovation without prioritizing traditional intellectual property rights; and third, knowledge commons can succeed in the critical fields of medicine and health. The editors’ knowledge commons framework adapts Elinor Ostrom’s ground- breaking research on natural resource commons to the distinctive attributes of knowl- edge and information, providing a systematic means for accumulating evidence about how knowledge commons succeed. The editors’ previous volume, Governing Knowledge Commons, demonstrated the framework’s power through case studies in a diverse range of areas. Governing Medical Knowledge Commons provides 15 new case studies of knowl- edge commons in which researchers, medical professionals, and patients generate, improve, and share innovations, offering readers a practical introduction to the knowl- edge commons framework and a synthesis of conclusions and lessons. The book is available Open Access at http://dx.doi.org/10.1017/9781316544587. Katherine J. Strandburg is the Alfred B. Engelberg Professor of Law at New York University School of Law, where she specializes in patent law, innovation policy, and information privacy law.
    [Show full text]
  • Pacific Symposium on Biocomputing 2010
    Pacific Symposium on Biocomputing 15:v-viii PACIFIC SYMPOSIUM ON BIOCOMPUTING 2010 This year marks the 15th year of PSB. Started in 1996 by Teri Klein and Larry Hunter, the meeting was a session within the Hawaii International Conference on Systems Sciences Conference (HICSS) for a couple of years. The interest in biocomputing was great and was threatening to alter the balance of attendees within the HICSS meeting, and so the organizers asked Teri and Larry to find an alternative. Taking advantage of the nice meeting opportunities in Hawaii immediately following New Year’s Day, they created PSB with a simple formula: oral presentation of peer reviewed papers in emerging areas of biocomputing, interactive poster sessions, and lively discussion sessions on these topics. They invited Keith Dunker and Russ Altman to join the team for the second PSB, and we have been working together ever since. In fifteen years, we have had some papers that have made remarkable impact. Using Google Scholar, the top 15 papers, in terms of citations over the last 15 years are listed below (with number of citations listed as of September 2009, followed by the reference). Obviously, there are other great papers, especially ones that have been written in the last few years, that have not yet had time to rise to the top of this list. Nevertheless, the list provides highlights of the last 15 years in biocomputing challenges. (573) REVEAL, A General Reverse Engineering Algorithm for Inference of GeneticNetwork Architectures S. Liang, S. Fuhrman and R. Somogyi; Pacific Symposium on Biocomputing 3:18-29 (1998).
    [Show full text]
  • Implementing a System to Enable Credit for Data Sharing April 9, 2018
    Implementing a System to Enable Credit for Data Sharing April 9, 2018 Association of American Medical Colleges 655 K Street, N.W., Washington, D.C. 20001 ATTENDEE LIST George Alter Gil Alterovitz Joyce Backus Research Professor Presidential Innovation Fellow Associate Director ICPSR White House Presidential Innovation National Library of Medicine Fellow Program Jacquelyn Bendall Gordon Bernard Barbara Bierer Director, Research Compliance Executive Vice President for Research Faculty Director Council on Governmental Relations Vanderbilt University MRCT Center Richard Bookman Brian Bot Philip Bourne Director and Senior Advisor Principal Scientist Director & Professor University of Miami Sage Bionetworks University of Virginia Timothy Clark Merce Crosas Jacques Demotes-Mainard Associate Professor Chief Data Science and Technology Director General University of Virginia Officer, IQSS ECRIN Harvard University Anurupa Dev Jeffrey Drazen Rebecca English Lead Specialist, Science Policy Editor-in-Chief Program Officer Association of American Medical New England Journal of Medicine National Academies Colleges David Fearon Megan Force Maryrose Franko Sr. Data Management Specialist Editor, Data Citation Index Executive Director Johns Hopkins University Clarivate Analytics Health Research Alliance Jason Gerson Cynthia Grossman Janet Hall Senior Program Officer Director Director, Precision Medicine PCORI FasterCures Platform American Heart Association Robert Hanisch Anna Hatch Denise Hills Director, Office of Data and DORA Community Manager Director,
    [Show full text]
  • AFP / Biosapiens 2008 SIG Meeting Program Toronto, Canada, July 2008
    AFP / Biosapiens 2008 SIG Meeting Program Toronto, Canada, July 2008 Dear AFP / Biosapiens 2008 attendee, Welcome to the joint Automated Function Prediction / Biosapiens meeting in ISMB 2008. This is the second time AFP and Biosapiens are joining forces to bring you an engaging and stimulating program. As usual, we strive to bring you the latest in cutting-edge research in computational gene and protein function prediction and annotation delivered by leading international researchers. This year we are holding a joint session with the 3D SIG on predicting function from protein structure. “Structure to Function” is a problem that is rapidly moving to the forefront of life science due to the increasing number of unannotated structures coming from structural genomics projects. We are also addressing a need voiced by the community to learn more about computational function prediction. Kimmen Sjölander from the University of California Berkeley will conduct a workshop on function prediction using phylogenomics. Yanay Ofran from Bar Ilan University, Israel and Predrag Radivojac from Indiana University will present a variety of components, tools and techniques for computational function prediction. Those tutorials are intended for researchers and students that are entering the field, and for those in the field to gain more knowledge and expertise. This is a rare opportunity to interact informally with leading experts in the field and enhance your own research. We would like to thank the members of the Program Committee for carefully reviewing the abstracts sent to this meeting. Thanks to the International Society for Computational Biology for hosting this meeting at ISMB 2008. A special thanks to Prof.
    [Show full text]