The Structure, Function and Evolution of the Extracellular Matrix: A Systems-Level Analysis
The Structure, Function and Evolution of the Extracellular Matrix: A Systems-Level Analysis

by Graham L. Cromar

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Department of Molecular Genetics
University of Toronto

© Copyright by Graham L. Cromar 2014

Abstract

The extracellular matrix (ECM) is a three-dimensional meshwork of proteins, proteoglycans and polysaccharides imparting structure and mechanical stability to tissues. ECM dysfunction has been implicated in a number of debilitating conditions including cancer, atherosclerosis, asthma, fibrosis and arthritis. Identifying the components that comprise the ECM and understanding how they are organised within the matrix is key to uncovering its role in health and disease. This study defines a rigorous protocol for the rapid categorization of proteins comprising a biological system. Beginning with over 2000 candidate extracellular proteins, 357 core ECM genes and 524 functionally related (non-ECM) genes are identified. A network of high-quality protein-protein interactions constructed from these core genes reveals that the ECM is organised into biologically relevant functional modules whose components exhibit a mosaic of expression and conservation patterns. This suggests module innovations were widespread and evolved in parallel to confer tissue-specific functionality on otherwise broadly expressed modules. Phylogenetic profiles of ECM proteins highlight components restricted and/or expanded in metazoans, vertebrates and mammals, indicating taxon-specific tissue innovations. Modules enriched for medical subject headings illustrate the potential for systems-based analyses to predict new functional and disease associations on the basis of network topology.
This study also explores the evolutionary forces that guided the development of the ECM. Analyses of domain conservation patterns in ECM proteins, including the use of a novel framework for identifying non-contiguous, conserved arrangements of domains, show that most are of pre-deuterostome origin. Many participate in novel domain arrangements in vertebrates, suggesting that the sampling of new domain combinations was an important mechanism leading to neofunctionalization of paralogous ECM genes. Distinct types of proteins and/or the biological systems in which they operate may have influenced the types of evolutionary forces that drive protein innovation. This emphasizes the need for rigorously defined systems to address questions of evolution that focus on specific systems of interacting proteins such as the ECM. Finally, overviewing the current state of our knowledge of the ECM, this study addresses important gaps and highlights areas worthy of further investigation.

Acknowledgments

This project would not have been possible without the loving support and understanding of many family and friends, most notably my wife Judith Moses. The opportunity to pursue a career in research is a privilege and I am very grateful for the many ways that I experienced support along the way from so many of you. In particular, I would like to acknowledge my father-in-law, Nelu Moses, who sadly could not be here to celebrate this achievement. I know he would have been among the first to do so. His enthusiasm will be long cherished. I thank the members of my supervisory committee including Gary Bader, Andrew Emili and Johanna Rommens. I am grateful for their kind support and advice at committee meetings. A more thoughtful and helpful project committee has surely never been struck! These accolades obviously extend to my supervisor, John Parkinson, who was brave enough to take on a student older than he was, and I am glad to count him a colleague and friend.
Particular thanks go to Fred Keeley for his ardent support and infectious enthusiasm. I thank Sylvie Richard-Blum for kindly hosting me in her laboratory in Lyon, France while working on the manuscript corresponding to the first data chapter. I also thank Emilie Chautard for the generous contribution of her time to the first paper and Megan Miao who kindly provided the recombinant elastin peptides used in the SPR analysis. Thanks to Shoshanna Wodak, Andrei Turinski, Shuye Pu, Brian Turner and other members of the Wodak lab for their comments and contributions during joint lab meetings and other discussions. Zhaolei Zhang provided knowledgeable advice on the analysis of sequences. Ka-chun Wong contributed significantly to the analysis of sequential patterns used in the second data chapter on domain architectures. James Wasmuth was a source of welcome distraction and the most likeable critic anyone could ask for. He, along with Xuejian Xiong, Hongyan (Bill) Song, Tuan On and Noeleen Loughran collaborated extensively on the development of pipelines, databases and programming. Most of the amazing things I learned to do with spreadsheets were the magic of Stacy Hung. I thank David He and Gabe Musso for influential advice in the early stages of my studies. It is with pleasure that I dedicate this thesis to a former science teacher, D. Witucki who reminded me at a formative stage of my education never to count myself out. This has turned out to be a most inspirational mantra!

Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Supporting Data Files
List of Abbreviations
Chapter 1 The Extracellular Matrix
1 Background
  1.1 Overview
  1.2 Discovery and Early History of the Matrix
  1.3 Defining Matrix Components
    1.3.1 Collagens
    1.3.2 Elastin and Elastic Fibres
    1.3.3 Proteoglycans
    1.3.4 Glycoproteins
    1.3.5 Cell surface receptors
    1.3.6 ECM associated growth factors
    1.3.7 Modifiers of ECM structure and function
    1.3.8 Identifying additional matrix proteins
  1.4 Evolution of the ECM and its components
  1.5 Structure / Function
    1.5.1 Self Assembly
      1.5.1.1 Collagen Fibre Assembly
      1.5.1.2 Elastic Fibre Assembly
    1.5.2 Tissue-specific Expression
    1.5.3 Post Translational Modifications
      1.5.3.1 Effects on solubility
      1.5.3.2 Biomineralization
      1.5.3.3 Activation/Inactivation by cleavage
      1.5.3.4 Modification of GAGs
      1.5.3.5 Bioactive Fragments
      1.5.3.6 Discovery of novel post-translational modifications
    1.5.4 Role in Development