Expansion of Microbial Virology by Impetus of the Reduction of Viral Dark Matter Siddharth Ravindran Krishnamurthy Washington University in St
Washington University in St. Louis
Washington University Open Scholarship

Arts & Sciences Electronic Theses and Dissertations
Arts & Sciences

Summer 8-15-2017

Expansion of Microbial Virology by Impetus of the Reduction of Viral Dark Matter

Siddharth Ravindran Krishnamurthy
Washington University in St. Louis

Follow this and additional works at: https://openscholarship.wustl.edu/art_sci_etds

Part of the Allergy and Immunology Commons, Immunology and Infectious Disease Commons, Medical Immunology Commons, and the Virology Commons

Recommended Citation
Krishnamurthy, Siddharth Ravindran, "Expansion of Microbial Virology by Impetus of the Reduction of Viral Dark Matter" (2017). Arts & Sciences Electronic Theses and Dissertations. 1221.
https://openscholarship.wustl.edu/art_sci_etds/1221

This Dissertation is brought to you for free and open access by the Arts & Sciences at Washington University Open Scholarship. It has been accepted for inclusion in Arts & Sciences Electronic Theses and Dissertations by an authorized administrator of Washington University Open Scholarship. For more information, please contact [email protected].

WASHINGTON UNIVERSITY IN ST. LOUIS
Division of Biology and Biomedical Sciences
Immunology

Dissertation Examination Committee:
David Wang, Chair
Gaya Amarasinghe
Gautam Dantas
Michael Diamond
Daved Fremont
S Josh Swamidass

The Reduction of Viral Dark Matter by Expanding of Viral Diversity
by
Siddharth Ravindran Krishnamurthy

A dissertation presented to The Graduate School of Washington University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

August 2017
Saint Louis, MO

Table of Contents

List of Figures
List of Tables
List of Abbreviations
Acknowledgments
Abstract
Chapter 1 Introduction
Chapter 1 References
Chapter 2 Hyperexpansion of RNA bacteriophage diversity
Chapter 2 References
Chapter 3 Identification of a novel bunya-like virus using DiscoVir, an alignment independent k-mer algorithm for viral sequence classification
Chapter 3 References
Chapter 4 Identifying RNA Phage Resistance Genes
Chapter 4 References
Chapter 5 Picobirnaviruses are a possible novel family of RNA bacteriophages
Chapter 5 References
Chapter 6 Conclusions and Future Directions
Chapter 6 References
Appendix I List of SRA Files Scanned for RNA Bacteriophages
Appendix II Taxonomy Breakdown for DiscoVir Analysis
Appendix III List of SRA Files Searched for Novel Discovirus Sequences
Curriculum Vitae

List of Figures

Figure 1.1 Generic overview of alignment based viral sequence identification
Figure 1.2 Assembly can identify more reads from novel viruses than direct alignment
Figure 1.3 The distribution of viral genomes in RefSeq GenBank as defined by their host species
Figure 1.4 A presence/absence plot of viral families that infect specific host types from the various Baltimore classification groups
Figure 2.1 Phylogenetic analyses of novel RNA bacteriophages discovered in metagenomic sequencing datasets
Figure 2.2 Unique characteristics of novel RNA bacteriophage.
Figure 2.3 Presence/absence heatmaps of RNA bacteriophage prevalence based on metagenomic sequencing in Rhesus Macaque Study 1.
Figure 2.4 Maximum-likelihood phylogenetic analysis
Figure 3.1 Performance of DiscoVir compared to BLASTx in a simulated experiment for identifying novel viral sequences.
Figure 3.2 Simulated experiment using DiscoVir and BLASTx with comparisons that involve Proteobacteria and Terrabacteria highlighted
Figure 3.3 Simulated experiment using DiscoVir and BLASTx with comparisons that involve Carmotetraviridae, Virgaviridae, and Idaeovirus highlighted
Figure 3.4 Simulated experiment using DiscoVir and BLASTx with each fungal taxa highlighted
Figure 3.5 Early retrieval rate was measured for the simulated experiment using CROC curves
Figure 3.6 Phylogeny and genome organization of Discovirus
Figure 3.7 Discovirus containing Penicillium colony morphology on SDA plate from serum of a patient with acute febrile illness.
Figure 3.8 Penicillium discovirus containing fungus phylogenetically clades among Penicillium species.
Figure 3.9 Detection of Discovirus genome segments in isolated Penicillium
Figure 3.10 Penicillium discovirus is maintained for multiple generations of Penicillium
Figure 4.1 RPKM abundance of metagenomic fragments in each selection round.
Figure 4.2 Qβ, MS2 and C-1 infection curves with overexpressed metagenomic fragments
Figure 4.3 Efficiency of plating of Qβ, MS2, fd and C-1 with overexpressed metagenomic fragments
Figure 5.1 Genome organization of Human Picobirnavirus
Figure 5.2 Alignment of RBS sequences among known and novel picobirnaviruses
Figure 5.3 RBS sequence enrichment among eukaryotic DNA and RNA viral sequences and bacteriophages
Figure 5.4 Schematic of initial HPBV reverse genetics system
Figure 5.5 Transmission electron microscopy of the overexpressed picobirnaviral segments.
Figure 5.6 Schematic of low copy HPBV reverse genetics system
Figure 5.7 HPBV2 WT Segment 2 does not decrease slower than the catalytically dead mutant
Figure 5.8 Schematic of HPBV constructs containing chloramphenicol resistance genes

List of Tables

Table 2.1 List of novel ssRNA phage partial genomes
Table 2.2 List of novel dsRNA bacteriophage partial genomes
Table 2.3 Number of ssRNA bacteriophage species based on varying amino acid identity cut-offs
Table 2.4 Nucleotide alignment identities of AVE001 amplicons from the screening of Rhesus Macaque Study 1.
Table 2.5 Nucleotide alignment identities of AVE000 amplicons from the screening of Rhesus Macaque Study 1.
Table 2.6 List of primers used to confirm partial genomes by Sanger sequence
Table 3.1 Optimization of DiscoVir parameters.
Table 3.2 Assembled contig prediction by DiscoVir
Table 3.3 Penicillium discovirus like sequences found in other fungi and plants
Table 3.4 Penicillium discovirus like contigs assembled from SRA samples
Table 3.5 Primers
List of Abbreviations

Abbreviation    Full Name
BHI             Brain Heart Infusion
CAT             Chloramphenicol Acetyl Transferase
CLASH           cross-linking, ligation and sequencing hybrids
contigs         Contiguous Sequence
CROC            Concentrated ROC
dsDNA           Double-Stranded DNA
dsRNA           Double-Stranded RNA
HPBV            Human Picobirnavirus
ICTV            International Committee on the Taxonomy of Viruses
ITS             Internal Transcribed Spacer
LB              Lysogeny Broth
NBC             Naïve Bayes Classifier
NCBI            National Center for Biotechnology Information
NEPRC           New England Primate Research Center
nr              Non-redundant protein database
nt              Non-redundant nucleotide database
ORF             Open Reading Frame
OTU             Operational Taxonomic Unit
RACE            Rapid amplification of cDNA ends
RBS             Ribosomal Binding Site
RdRp            RNA dependent RNA Polymerase
ROC             Receiver Operator Characteristic
SDA             Sabouraud Dextrose Agar
SIV             Simian Immunodeficiency Virus
SRA             Sequence Read Archive
ssDNA           Single-stranded DNA
ssRNA           Single-stranded RNA
SVM             Support Vector Machine
TEM             Transmission Electron Microscopy
TNPRC           Tulane Primate Research Center

Acknowledgments

I have been lucky to have an amazing thesis committee. They have provided me invaluable insight into all my scientific projects as well as my personal professional development. I am truly grateful for the time Dr. Josh Swamidass spent with me deepening my knowledge of machine learning and providing absolutely critical insight on how to develop DiscoVir; this project would not have been possible without his help and has been truly trajectory altering for me. Discussions with Dr. Daved Fremont have broadened my view on how to capitalize on the genetic diversity of large, dsDNA viruses to understand biology, and I plan to use this viewpoint to inform my future projects. Dr. Michael Diamond provided me with critical resources to perform mouse experiments dealing with RNA bacteriophages, as well as an intellectual framework by which to examine host genes in the context of multiple viruses rather than a single virus. Dr.
Gautam Dantas was extremely gracious in connecting me with members of his lab who were familiar with functional metagenomic screens, as well as providing the initial libraries to perform this screen; the ability to work on this project has also been truly trajectory altering. Dr. Gaya Amarasinghe