Genome Based Analyses Reveals the Presence of Heterotypic Synonyms
bioRxiv preprint doi: https://doi.org/10.1101/2020.12.13.418756; this version posted December 13, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Genome based analyses reveals the presence of heterotypic synonyms and subspecies in Bacteria and Archaea

Munusamy Madhaiyan1†, Venkatakrishnan Sivaraj Saravanan2† & Wah-Seng See-Too3†

1Temasek Life Sciences Laboratory, 1 Research Link, National University of Singapore, Singapore 117604
2Department of Microbiology, Indira Gandhi College of Arts and Science, Kathirkamam 605009, Pondicherry, India
3Division of Genetics and Molecular Biology, Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
†These authors have contributed equally to this work

Keywords: dDDH; AAI; genome-based taxonomy; 16S rRNA gene similarity; polyphasic taxonomy; OGRI

Abbreviations: dDDH, digital DNA-DNA hybridization; ANI, average nucleotide identity; OGRI, overall genome related indices

Abstract

The term heterotypic synonym refers to different names that have been associated with different type strains but that, in the opinion of a bacteriologist, belong to the same taxon; the term subspecies refers to strains and genetically close organisms that are diverging phenotypically. In this study, sequenced and publicly available genomes in the Edgar 2.0 server were carefully analysed and, based on high (> 98 %) amino acid identity values, synonyms were putatively identified.
The 16S rRNA gene sequences of those species were used to construct maximum likelihood phylogenetic trees and to infer genetic closeness or distance by examining the tree topology and the clustering of the organisms within clades. The species were further subjected to overall genome related indices, such as digital DNA-DNA hybridization and average nucleotide identity, to confirm the presence of synonyms or subspecies, with support from phenotypic data. The outcome of this polyphasic taxonomic re-analysis was the identification of 40 later heterotypic synonyms and 13 subspecies spread over the phyla Actinobacteria, Bacteroidetes, Firmicutes, Nitrospirae, Proteobacteria and Thermotogae and the domain Archaea.

INTRODUCTION

A taxon consists of one or more elements and can belong to any taxonomic category from class to subspecies; one of these elements is designated as the nomenclatural type. The term type refers to the nomenclatural type strain designated under the rules of the ICNP; it is the element of the taxon with which the name is permanently associated, whether as the correct name or as a heterotypic synonym [1]. In simple terms, a synonym is the same taxon under another scientific name; synonyms usually come in pairs or even swarms [2]. In older literature, they were called objective and subjective synonyms, which are now referred to as homotypic and heterotypic synonyms, respectively [1]. In a nutshell, homotypic synonyms are more than one name associated with the same type strain, whereas heterotypic synonyms are different names associated with different type strains that, in the opinion of the bacteriologist concerned, belong to the same taxon.
Heterotypic synonyms were first used in conjunction with junior synonyms in a study of Helicobacter type strains, in which Helicobacter nemestrinae [3] was identified as a junior heterotypic synonym of Helicobacter pylori [4]. Later heterotypic synonyms were subsequently identified in different genera, including Janibacter, Weissella and Streptomyces [5, 6, 7]. Conventionally, heterotypic synonyms were proposed on the basis of 16S rRNA gene analysis and DNA-DNA hybridization studies, as in the case of Janibacter and Weissella [5, 6]. For instance, applying the DNA-DNA hybridization technique to 13 Streptomyces species identified two subspecies and 8 later heterotypic synonyms [7]. Another study provided polyphasic evidence for heterotypic synonymy using fluorescent amplified fragment length polymorphism, DNA-DNA hybridization and API 50 CHL analysis of Lactobacillus ferintoshensis and Lactobacillus parabuchneri, which share similar genetic, biochemical and physiological features; this led to the proposal of L. ferintoshensis as a later heterotypic synonym of L. parabuchneri [8]. A further study used multilocus sequence typing to identify heterotypic synonyms; the similarity of three housekeeping genes, pheS, rpoA and atpA, was used to propose Lactobacillus cypricasei as a later heterotypic synonym of Lactobacillus acidipiscis [9]. In later studies, heterotypic synonyms were identified during attempts to characterize newly isolated strains, or were discovered serendipitously during attempts to validate a previously published strain [10, 11].
A new dimension in the study of heterotypic synonyms was the suggested use of average nucleotide identity (ANI). Because next generation sequencing provided a rapid and cost-effective way of obtaining whole genome sequences, a proposal was put forward to integrate genomics as a polyphasic component of the taxonomy and systematics of Bacteria and Archaea [12]. ANI values and comparative genomic analysis have since become an integral part of proposals of heterotypic synonyms in Neisseria, Deinococcus and Bacillus [13, 14, 15].

The term subspecies denotes strains and genetically similar organisms that are phenotypically divergent [16, 17]. Previously, the delineation of subspecies was based on qualitative assessment of phenotypic characters rather than on evolutionary distance or 16S rRNA gene similarity. The current approach to the taxonomic assignment of subspecies, however, is based on overall genome related indices, such as a digital DNA-DNA hybridization (dDDH) value of 70-79 % [18].

The advent of genome sequencing technologies has been reflected in bacterial systematics mainly through the reclassification of taxa at different hierarchical levels; such proposals have been made for the phyla Actinobacteria and Bacteroidetes and the class Alphaproteobacteria [19, 20, 21]. For example, within the Actinobacteria, calculation of intergenomic dDDH values led to the proposal of 29 new later heterotypic synonyms and 8 new subspecies [19], and in the phylum Bacteroidetes 6 later heterotypic synonyms and 5 new subspecies were proposed [20].
The recent analysis of the class Alphaproteobacteria revealed the presence of 33 later heterotypic synonyms, 12 new subspecies and 2 new species [21]. The present study focuses on whole genome-based analyses to identify heterotypic synonyms among the sequenced and publicly available bacterial genomes. The idea behind this work originated when certain species with publicly available genomes in the Edgar 2.0 server [22] showed high amino acid similarity when their genomes were compared. Previous genome analyses indicate that strains of the same species share an amino acid identity (AAI) of > 95 % [23]. Based on this notion, a total of 12479 genomes, comprising 548 genera housed within 227 families, were examined. Different species showing > 98 % AAI were assumed to include a heterotypic synonym or subspecies and were selected for the present study. In addition, the possibility of heterotypic synonymy was predicted when the genome results of certain recently published type strains were critically examined. These putatively identified strains were also subjected to phylogenetic analysis and to overall genome related indices (OGRI) analyses, such as dDDH and average nucleotide identity (ANI), and their phenotypic traits were also considered, in order to propose a holistic taxonomic framework for the heterotypic synonyms and subspecies present in different genera.

METHODS

In this study, 95 type strains of different species of Bacteria and Archaea whose genomes shared > 98 % AAI were selected from the Edgar 2.0 server, and certain recently published genomes containing putative heterotypic synonymy were downloaded from NCBI.
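The screening and decision logic described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy AAI values and the placeholder species labels are hypothetical. The > 98 % AAI cut-off and the 70-79 % dDDH subspecies range come from the text; treating dDDH values above 79 % as pointing to a later heterotypic synonym and values below 70 % as distinct species is an assumption made for illustration.

```python
from typing import Dict, List, Tuple

AAI_SCREEN = 98.0       # from the text: species pairs with > 98 % AAI are candidates
DDDH_SUBSP_LOW = 70.0   # from the text: lower bound of the subspecies range
DDDH_SUBSP_HIGH = 79.0  # from the text: upper bound of the subspecies range


def screen_pairs(aai: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Return the species pairs whose AAI exceeds the screening cut-off."""
    return sorted(pair for pair, value in aai.items() if value > AAI_SCREEN)


def classify(dddh: float) -> str:
    """Map a dDDH value (in %) to a tentative taxonomic interpretation.

    Assumption: values above the 70-79 % range suggest a later heterotypic
    synonym and values below it suggest distinct species.
    """
    if dddh < DDDH_SUBSP_LOW:
        return "distinct species"
    if dddh <= DDDH_SUBSP_HIGH:
        return "subspecies candidate"
    return "later heterotypic synonym candidate"


if __name__ == "__main__":
    # Toy AAI values for placeholder species labels (not real data).
    toy_aai = {("sp. A", "sp. B"): 99.1, ("sp. A", "sp. C"): 96.0}
    for pair in screen_pairs(toy_aai):
        print(pair, "passes the AAI screen")
    print(classify(74.5))
```

In practice, any such call would only be tentative: the text combines these indices with 16S rRNA gene phylogeny, ANI and phenotypic data before proposing a synonym or subspecies.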
Strains of these selected genomes were distributed among the phyla Actinobacteria, Bacteroidetes, Firmicutes, Nitrospirae, Proteobacteria and Thermotogae and the domain Archaea; their genome details are presented in Table