Prediction of Host-Microbe Interactions from Community High-Throughput Sequencing Data

University of Tennessee, Knoxville
TRACE: Tennessee Research and Creative Exchange
Doctoral Dissertations
Graduate School
12-2017

Prediction of Host-Microbe Interactions from Community High-Throughput Sequencing Data
Joshua Michael Stough
University of Tennessee, [email protected]

Follow this and additional works at: https://trace.tennessee.edu/utk_graddiss

Recommended Citation
Stough, Joshua Michael, "Prediction of Host-Microbe Interactions from Community High-Throughput Sequencing Data." PhD diss., University of Tennessee, 2017. https://trace.tennessee.edu/utk_graddiss/4838

This Dissertation is brought to you for free and open access by the Graduate School at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected].

To the Graduate Council:

I am submitting herewith a dissertation written by Joshua Michael Stough entitled "Prediction of Host-Microbe Interactions from Community High-Throughput Sequencing Data." I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Microbiology.

Steven W. Wilhelm, Major Professor

We have read this dissertation and recommend its acceptance:
Jill A. Mikucki, Andrew D. Steen, Erik R. Zinser

Accepted for the Council:
Dixie L. Thompson
Vice Provost and Dean of the Graduate School

(Original signatures are on file with official student records.)

Prediction of Host-Microbe Interactions from Community High-Throughput Sequencing Data

A Dissertation Presented for the Doctor of Philosophy Degree
The University of Tennessee, Knoxville

Joshua Michael Albert Stough
December 2017

Copyright © 2017 by Joshua M. A. Stough
All rights reserved.

Acknowledgements

First, I would like to thank my advisor and mentor, Dr. Steven Wilhelm, for giving me the freedom to pursue research the way I enjoy it most, and for applying pressure in just the right ways to keep me moving forward in spite of failure or lack of progress. To my committee, thank you for your advice and patience over the years. Finishing this document and this degree would not be possible without the contributions each of you has made during my time here.

I would also like to extend my thanks to the many members of our department who have made this all possible. To Dr. Mohammed Moniruzzaman, Eric Gann, and Sam Coy, whose expertise in giant viruses has always informed and advised the direction of my research. To Dr. Gary LeCleir for the training, patience, and support in the lab. To Chelsi Cassilly and Joseph Jackson for their friendship and moral support.

Lastly, I would like to thank the members of my family that have provided encouragement, advice, and care over the ten years I have spent in schools preparing for this. None of this would be possible without the care, support, and patience of my wife. Likewise, my parents have always encouraged me to follow this path, one to which I have aspired since the first grade. Thank you all for standing with me through the best and worst of it.

Abstract

Microbial ecology is a diverse field, with a broad range of taxa, habitats, and trophic structures studied. Many of the major areas of research were developed independently, each with their own unique methods and standards, and their own questions and focus. This has changed in recent decades with the widespread implementation of culture-independent techniques, which exploit mechanisms shared by all life, regardless of habitat.
In particular, high-throughput sequencing of environmentally isolated DNA and RNA has done much to expand our knowledge of the planet’s microbial diversity and has allowed us to explore the complex interplay between community members. Additionally, metatranscriptomic data can be used to parse relationships between individual members of the community, allowing researchers to propose hypotheses that can be tested in a laboratory or field setting. However, use of this technology is still relatively young, and there is a considerable need for broader consideration of its pitfalls, as well as the development of novel approaches that allow those without a computational background or with fewer resources to navigate its challenges and reap its rewards. To address these needs, we have developed targeted computational approaches that simplify next-generation sequencing datasets to a more manageable size, and we have used these techniques to address specific questions in environmental ecosystems. In a dataset sequenced for the purpose of identifying ecological factors that drive Microcystis aeruginosa to dominate cyanobacterial harmful algal blooms worldwide, we used a targeted approach to predict replication and lysogenic dormancy in bacteriophage. We used RNA-seq data to characterize viral diversity in the Sphagnum peat bog microbiome, identifying a wealth of novel viruses and proposing several host-virus pairs. We were able to assemble and describe the genome of a freshwater giant virus as well as that of a virophage that may infect it, and we used our techniques to describe its activity in publicly available datasets. Lastly, we have extended our efforts into the realm of medicine where we showed the influence exerted by the mouse gut microbiome on the host immune response to malaria, identifying several genes that may play a key role in reducing disease severity. 

Table of Contents

CHAPTER I: INTRODUCTION
    Part I: Literature Review: Microbial Ecology in the Age of Next-Generation Sequencing and the ‘omics data revolution
    Part II: Linking Sequencing data to Microbial interactions
CHAPTER II: MOLECULAR PREDICTION OF LYTIC VS LYSOGENIC STATE FOR MICROCYSTIS PHAGE: METATRANSCRIPTOMIC EVIDENCE OF LYSOGENY DURING LARGE BLOOM EVENTS
    Publication Note
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Acknowledgments
    Chapter II Appendix
CHAPTER III: NOVEL VIRUSES WITHIN SPHAGNUM PEAT IDENTIFIED BY HIGH-THROUGHPUT TRANSCRIPT SEQUENCING DATA
    Publication Note
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Acknowledgments
    Chapter III Appendix
CHAPTER IV: GENOME OF CHRYSOCHROMULINA PARVA VIRUS AND ITS CONSTITUENT VIROPHAGE
    Publication Note
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Acknowledgements
CHAPTER V: FUNCTIONAL CHARACTERISTICS OF THE GUT MICROBIOME IN C57BL/6 MICE DIFFERENTIALLY SUSCEPTIBLE TO PLASMODIUM YOELII
    Publication Note
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Acknowledgements
    Chapter V Appendix
CHAPTER VI: CONCLUSIONS
LIST OF REFERENCES
VITA

List of Tables

Chapter II
    Table 2.1 Lake Tai Sample Identification Numbers
    Table 2.2 Environmental Data collected from Sample Sites
    Table 2.3 Environmental Data collected from Sample Sites
    Table 2.4 Reads Mapped to Reference Genomes
    Table 2.5 Reference Sequence Accession Numbers
    Table 2.6 Statistically Indistinguishably Expressed Gene Groups
Chapter III
    Table 3.1 Giant Virus Clusters of Orthologous Groups
    Table 3.2 RNA Virus Reference Genes and Accessions
Chapter V
    Table 5.1 Mouse and Bacterial Genes Differentially Expressed in Different Mouse Strains
    Table 5.2 Metabolome Statistics

List of Figures

Chapter I
    Figure 1.1 Cost of Sequencing versus Moore’s Law
Chapter II
    Figure 2.1 Lytic & Lysogenic gene expression by station
    Figure 2.2 Tail Sheath, transposase, and recombinase co-expression
    Figure 2.3 Ma-LMM01 whole genome expression
    Figure 2.4 Environmental contribution to whole genome expression
    Figure 2.5 Phylogenetic distribution of IS607 reads
Chapter III
    Figure 3.1 Phylogenetic placement of Gp23 contigs
    Figure 3.2 Phylogenetic placement of Gp20 contigs
    Figure 3.3 Phylogenetic placement of RecA contigs
    Figure 3.4 Phylogenetic placement of RNR contigs
    Figure 3.5 Phylogenetic placement of ssRNA virus RDRP contigs
    Figure 3.6 Genome architecture of RNA virus genome fragments
    Figure 3.7 Phylogenetic placement of Giant Virus MCP contigs
    Figure 3.8 Phylogenetic placement of Virophage contigs
    Figure 3.9 Correlation co-occurrence networks
Chapter IV
    Figure 4.1 CpV ORF BLAST hit distribution
    Figure 4.2 Maximum likelihood phylogenetic tree of conserved giant virus genes
    Figure 4.3 CpV whole genome alignment and ORF prediction
    Figure 4.4 Maximum likelihood phylogenetic tree of virophage conserved genes
    Figure 4.5 Virophage genomic architecture comparison
    Figure 4.6 Relative viral abundance in Lake Tai 2013
Chapter V
    Figure 5.1 Parasitemia progression in C57BL/6 mice
    Figure 5.2 Relative abundance of bacterial phyla
    Figure 5.3 NMDS analysis of bacterial communities
    Figure 5.4 Relative abundance of SEED subsystems functional categories
    Figure 5.5 Volcano plot of bacterial differentially expressed genes
    Figure 5.6 Volcano plot of mouse differentially expressed genes
    Figure 5.7 Heatmap showing metabolite abundances

CHAPTER I: INTRODUCTION

Part I: Literature Review: Microbial Ecology in the Age of Next-Generation Sequencing and the ‘omics data revolution

Abstract

Microbial ecology is a diverse field, with a broad range of taxa, habitats, and trophic structures studied. Many of the major areas of research
