Whole-Genome Comparison Between the Type Strain of Halobacterium Salinarum (DSM 3754T) and the Laboratory Strains R1 and NRC-1
Total Page:16
File Type:pdf, Size:1020Kb
Whole-genome comparison between the type strain of Halobacterium salinarum (DSM 3754T) and the laboratory strains R1 and NRC-1. Friedhelm Pfeiffer, Gerald Losensky, Anita Marchfelder, Bianca Habermann and Mike Dyall-Smith MicrobiologyOpen, 2019, accepted Note: This document contains Supplementary Tables for the above manuscript. The legends use some terms which are only defined in the context of that manuscript. Here is one paragraph from that manuscript which briefly describes the content of the tables. Correlation of protein-coding genes among Halobacterium salinarum strains 91-R6T, R1, and NRC-1. As an annotation principle, every gene encoding a protein on a matching genome segment in one strain must have a correlated gene in the other strain. The gene sets of the three strains have been correlated in detail (for details see Methods of the associated manuscript and Appendix A4 and A5). Proteins are classified as strain- specific only after validation by tBLASTn that they are not mere missing gene calls. Correlated proteins, encoded on the chromosome of strains 91- R6 and R1 are listed in Table S1 (1986 proteins). The corresponding proteins from strain NRC-1 are also listed. In addition, there are regions of very high similarity between the plasmids from strains 91-R6 and R1, there are plasmid encoded correlated proteins, which are listed in Table S2. Furthermore, Table S2 lists proteins which are encoded on a plasmid in strain R1 but in a strain-specific region of the chromosome from strain 91- R6. Chromosomally encoded proteins from strain-specific regions of 91-R6 are listed in Table S3. In several cases, a homolog exists in R1 but the genes are not positionally correlated. Such homologs are also listed in Table S3. Residual proteins which are specific for the chromosome of strain R1 are listed in Table S4. Plasmid encoded proteins specific for strain 91-R6 are listed in Table S5, while plasmid encoded proteins specific for strain R1 are listed in Table S6. A few protein-coding genes exist in strain NRC-1 but are absent in both, strains 91-R6 and R1. These are listed in Table S7. Finally, Table S8 lists ORFs which are annotated in the current version of the NRC-1 genome (AE004437, AE004438, AF016485) but which are considered not to code for a protein (spurious ORFs; for definition see Appendix A4 in the associated manuscript).