In-Silico Guided Identification of Ciliogenesis Candidate Genes in a Non-Conventional Animal Model
Iowa State University Capstones, Theses and Dissertations
Graduate Theses and Dissertations
2019

In-silico guided identification of ciliogenesis candidate genes in a non-conventional animal model

Natalia I. Acevedo Luna
Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd
Part of the Bioinformatics Commons

Recommended Citation
Acevedo Luna, Natalia I., "In-silico guided identification of ciliogenesis candidate genes in a non-conventional animal model" (2019). Graduate Theses and Dissertations. 17382.
https://lib.dr.iastate.edu/etd/17382

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected].

In-silico guided identification of ciliogenesis candidate genes in a non-conventional animal model

by

Natalia Acevedo Luna

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Major: Bioinformatics and Computational Biology

Program of Study Committee:
Heike Hofmann, Co-major Professor
Geetu Tuteja, Co-major Professor
Matthew Hufford
Dennis V. Lavrov
Mohan Gupta

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University
Ames, Iowa
2019

DEDICATION

This dissertation would never have been possible without the support and love of the wonderful people I have the honor of calling my family; they are my core and my everything.

First and foremost, I am deeply beholden to my parents, Pablo Acevedo and Gladys Luna, who have supported me in every possible aspect, have never once doubted me, and have given me the strength to continuously push myself further in my professional and personal development.

In the same way, I owe gratitude to my siblings Boris and Luciana for their continuous encouragement, patience, and care throughout the years. They are not only my brother and sister but also my best friends. Thank you for always being there for me!

Expressing my affection, appreciation, and devotion for my husband, Jan Hoinka, in words will never do justice to how deep my feelings for him truly are. His love, his unshattered belief in me, his honesty, and his persistent moral support and care have made me a better person and given me the strength to persist, both professionally and on a personal level, every single day.

Above all, I dedicate this thesis to the memory of my beloved Mother. Her example of life, perseverance, care, dedication, love, and her overall philosophy on life are deeply rooted in me and present in every single moment, shaping who I am and who I am still to become. This thesis is dedicated to her.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
ABSTRACT

CHAPTER 1. GENERAL INTRODUCTION
    Motile and Non-motile Cilia
    Cilium: A Conserved Organelle
    Multiciliated Cell Differentiation
    Transcriptional Regulation by FoxJ1 and Rfx
    Platynereis dumerilii as an Animal Model
    Structure of this Dissertation
    Bibliography

CHAPTER 2. PDUMBASE
    Abstract
    Background
    Construction and Content
    Transcriptome Assembly and Annotation Pipeline
    Expression Analysis
    Comparative Transcriptome
    Utility and Discussion
    Annotation Search
    Sequence Similarity Search
    Comparative Analysis
    Co-expression Information
    Database Implementation
    Additional Features
    Conclusions
    List of Abbreviations
    Declarations
    Availability of Data and Materials
    Competing Interests
    Funding
    Authors’ Contributions
    Acknowledgements
    Bibliography

CHAPTER 3. CILIOGENESIS
    Abstract
    Background
    Results
    Identification of Known Ciliary Genes in P. dumerilii
    Candidate Genes by β-Catenin Cell Fate Transformation
    High Confidence Ciliogenesis Candidate Genes in P. dumerilii
    Ciliogenesis Candidate Genes by Co-expression Analysis
    Known Ciliary Genes by Localization and Functional Domains
    Ciliogenesis Precursors
    Discussion
    Identification of Ciliogenesis Precursors and Transcriptional Regulators
    Potential Novel Candidate Genes Based on Co-expression Analysis
    Conclusion
    Methods
    Compilation of a Comprehensive Set of Known Ciliary Genes
    Identification of Known Ciliary Genes Conserved in P. dumerilii
    Classifying Known Ciliary Genes by Localization and Functional Domains
    P. dumerilii Culture, Azakenpaullone Inhibitor Treatment, and Sequencing
    Read Processing and Differential Expression Analysis
    Identification of Ciliogenesis Candidate Genes
    Expected Fold-change of Expression for Ciliary Genes
    Identification of Ciliogenesis Candidate Genes by Annotation
    Identification of Ciliogenesis Candidate Genes by Co-expression Analysis
    Data and Availability
    Bibliography

CHAPTER 4. DENDROSHINY
    Abstract
    Availability
    Introduction
    Materials and Methods
    Data Input
    Data Preprocessing
    Cluster Generation and Classification
    The Web Interface
    Case Study
    Effect of Tree Cutoff in Clustering Results
    Effect of Timepoint Selection in Clustering Results
    Discussion and Conclusion
    Bibliography

CHAPTER 5. GENERAL CONCLUSIONS
    Overview
    Future Directions
    PdumBase Expansion
    Ciliogenesis Candidate Genes Outlook
    Towards a Ciliogenesis Gene Regulatory Network
    Concluding Remarks

APPENDIX A. PDUMBASE
    Supplementary Figures
    PdumBase Manual

APPENDIX B. CILIOGENESIS
    Supplementary Figures
    Supplementary Tables
    Bibliography

LIST OF FIGURES

Figure 1.1 General structure of a Eukaryotic motile cilium
Figure 1.2 Overview of computational analysis pipeline
Figure 2.1 Schematic illustration of the PdumBase sitemap
Figure 2.2 PdumBase search interface and search result options
Figure 2.3 PdumBase expandable results option
Figure 3.1 Candidate genes by β-Catenin cell fate transformation
Figure 3.2 Predicted effect of cell fate transformation on treated embryos
Figure 3.3 Distribution of the ratio of change in gene expression
Figure 3.4 Co-expression of highest IC gene
Figure 3.5 Co-expression of high confidence ciliogenesis candidate genes
Figure 3.6 Candidate genes classified into structural components
Figure 3.7 Candidate genes classified into functional components
Figure 3.8 Potential Ciliogenesis precursors identified by co-expression
Figure 4.1 Data flow overview of DendroShiny
Figure 4.2 Screen capture of the web interface of DendroShiny
Figure 4.3 Impact of tree cut parameter c in co-expression results
Figure 4.4 Exploration of the clustering neighbourhood with DendroShiny
Figure 4.5 Impact of time point selection in co-expression results
Figure A.1.1 Results from PdumBase Result Interface: Co-expression
Figure A.1.2 Results from PdumBase Result Interface: Gene models
Figure A.2.3 PdumBase Search result interface
Figure A.2.4 PdumBase Expression data tab interface
Figure A.2.5 PdumBase Search results interface: Uniprot
Figure A.2.6 PdumBase Search results interface: Extension
Figure A.2.7 PdumBase Annotation tab interface
Figure A.2.8 PdumBase Search result interface: Expression profile
Figure A.2.9 PdumBase Plot tab interface
Figure A.2.10 Heat map of 13,160 expressed genes
Figure A.2.11 PdumBase Coexpression information interface
Figure A.2.12 PdumBase Ortholog expression profile interface
Figure A.2.13 PdumBase List tab interface
Figure A.2.14 PdumBase Ortholog groups interface
Figure A.2.15 PdumBase Alignment tab interface
Figure A.2.16 PdumBase Search interface
Figure A.2.17 PdumBase Search interface
Figure A.2.18 PdumBase Search result interface
Figure A.2.19 PdumBase Plot tab
Figure A.2.20 PdumBase Expression data tab
Figure A.2.21 PdumBase Annotation tab
Figure A.2.22 PdumBase Search results interface
Figure A.2.23 PdumBase Search results interface
Figure A.2.24 PdumBase Search results interface
Figure A.2.25 PdumBase Search results interface: Show other info
Figure A.2.26 PdumBase Search results interface: Show other info selected
Figure A.2.27 PdumBase Search results interface
Figure A.2.28 PdumBase Search results interface
Figure A.2.29 PdumBase Search interface
Figure A.2.30 PdumBase Search results interface
Figure B.1.1 Algorithmic Pipeline to Identify by Sequence Similarity
Figure B.1.2 Pipeline to Identify Ciliogenesis Candidate Genes
Figure B.1.3 Cover Art for Manuscript

LIST OF TABLES

Table 3.1 Description of sources of known ciliary genes
Table 3.2 Known ciliary genes classified into strict and inclusive core of high confidence genes
Table 3.3 Summary of known ciliary genes discriminated into sub-sets
Table 3.4 Guilt by Association by gene name
Table 3.5 Ciliogenesis candidate genes by co-expression analysis
Table 3.6 Classification of known ciliary genes into functional and structural components
Table A.2.1 Time points from Early Stages data set
Table A.2.2 Late Stages included in data set
Table A.2.3 Species and number of protein sequences for comparative analysis
Table A.2.4 Number of ortholog genes between the 6 species
Table A.2.5 Species and number of genes used to find ortholog groups
Table B.2.1 Compilation of Known Ciliary Genes
Table B.2.2 P. dumerilii Transcripts Aligning to Multiple Known Ciliary Genes
Table B.2.3 Strict Core of Known Ciliary Genes

ACKNOWLEDGEMENTS

I would like to thank my major professor Dr.