The Organization and Evolution of Biological Networks in Bacteria
Elucidating Biological Pathways and Complexes Involved in Bacterial Survival and Environmental Adaptation

by

Cedoljub Bundalovic-Torma

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Biochemistry
University of Toronto

© Copyright by Cedoljub Bundalovic-Torma 2018

The Organization and Evolution of Biological Networks in Bacteria: Systematic Exploration of Large-Scale Biological Networks in E. coli and Exopolysaccharide Biosynthetic Machineries

Abstract

Bacteria inhabit diverse environmental niches and employ various functional repertoires encoded in their genomes relevant for survival and adaptation. It has long been proposed that gene duplication plays an important role in bacterial adaptation; however, systematic experimental study of the functional roles of duplicates in bacteria has been lacking. From decades of small-scale experimental work with the model bacterium Escherichia coli, our view of the bacterial cell has expanded to encompass the concept of biological networks, a wiring diagram of the cell representing diverse functional associations of genes. To help define these complex functional relationships, large-scale genetic and protein interaction screens have been devised and applied to E. coli, providing datasets for deriving novel and biologically meaningful associations. In this work I present a study of gene duplication in the context of several large-scale biological networks. First (Chapter 2), I investigate two recently published E. coli genetic interaction (GI) networks, and find that duplicates are likely to contribute to increased robustness through epistatic buffering or integration in the context of biological pathways and protein complexes with broad biological roles [1] and DNA damage and repair response pathways [2].
Next (Chapter 3), I further investigate the implications of gene duplication in the context of physical protein interactions, based on a recent mapping of cell-envelope complexes generated for E. coli [3], revealing several instances where the acquisition of novel physical interactions has likely led to neofunctionalization of duplicates involved in environmental adaptation. Finally (Chapter 4), in a first systematic survey of biofilm secretion machineries, I combine a novel phylogenetic clustering approach and genome proximity networks to identify the impact of duplication on the evolution of operons as phylum-specific niche adaptations. In summary, this work expands our knowledge of how the rewiring of physical, epistatic, and genomic associations of duplicates has shaped biological pathways of adaptive significance in bacteria.

Acknowledgments

This work would not have been possible without the guidance of my supervisor, Dr. John Parkinson, as well as the contributions of collaborators who have provided invaluable opportunities for me to accomplish the work presented in this thesis. Above all, I would like to thank my parents, Zoran Torma and Branka Bundalovic, and my dear friend Mike Travis for their unwavering support during my studies.

Table of Contents

Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Supplemental Files

Chapter 1 Background
1 Charting Cellular Complexity in Bacteria
  1.1 Overview
  1.2 Escherichia coli as a Model Organism of Gram-negative Bacterial Biology
  1.3 Biological Networks – Wiring Diagrams of Biological Processes
    1.3.1 What are Biological Networks?
    1.3.2 Protein Interaction Networks
    1.3.3 Genetic Interaction Networks
    1.3.4 Computationally Derived Functional Interaction Networks Inferred Through Genomic Context Methods
    1.3.5 Integrative Approaches to Boosting Reliability of Physical Interactions with Predicted Functional Assignments
  1.4 Comparative Genomics and its Application Toward the Study of Biological Networks
    1.4.1 What is Comparative Genomics?
    1.4.2 Bacterial Genomic Evolution
    1.4.3 Defining the Evolutionary Relationships of Genes – Orthology and Paralogy
    1.4.4 Computational Approaches for Functional Annotation of Protein Sequences from Sequence Homology
      1.4.4.1 Sequence Homology Based on Conservation of Structural Domains: Protein Families
      1.4.4.2 Sequence Homology Based on Inference of Evolutionary Relationships: Gene Families
    1.4.5 Application of Biological Networks and Comparative Genomics Approaches to Study the Evolution of Bacterial Biological Processes
      1.4.5.1 Robustness and Evolvability
      1.4.5.2 Defining Modularity in Biological Networks
      1.4.5.3 Functional and Evolutionary Modularity
  1.5 Project Goals and Rationale

Chapter 2 Investigation of the Evolution of Diverse Biological Pathways and Functional Divergence of Paralogs in E. coli Genetic Interaction Networks
2 E. coli Genetic Interaction Networks
  2.1 Materials and Methods
    2.1.1 Source of Datasets
    2.1.2 Enrichment of Genetic Interactions in Functional Modules
    2.1.3 Determination of Evolutionary Co-Conservation of Broad-GI Genes using Mutual Information of Phylogenetic Profiles
  2.2 Results
    2.2.1 E. coli Functional Modules Demonstrate an Enrichment in Genetic Interactions
    2.2.2 Phylogenetic Conservation of GI Networks Provide Evolutionary Insights into the Functional Integration and Divergence of Biological Processes
  2.3 Discussion and Conclusions

Chapter 3 Investigation of the Organization of Physical Complexes and Functional Divergence of Paralogs in the E. coli Cell Envelope
3 The E. coli Cell Envelope Protein-Protein Interaction Network
  3.1 Materials and methods
    3.1.1 Sources of Data: E. coli Cell Envelope Associated Proteome Physical Interaction Network
    3.1.2 Prediction and benchmarking of predicted E. coli CE-PPI protein complexes with Markov Clustering
    3.1.3 Integration of CE-PPI and previously published E. coli Genetic Interaction Networks
    3.1.4 Analysis of Functional Divergence of Paralogs by Differences in PPI Overlap