Massive Genome Decay and Expansion of Insertion Sequences Drive the Evolution of a Novel Host-Restricted Bacterial Pathogen
Additional File 1 for:
Massive genome decay and expansion of insertion sequences drive the evolution of a novel host-restricted bacterial pathogen

Gonzalo Yebra1,a, Andreas F Haag2,a, Maan M Neamah2,3,4, Bryan A Wee1, Emily J Richardson1, Pilar Horcajo5, Sander Granneman6, María Ángeles Tormo-Más7,8, Ricardo de la Fuente5, J Ross Fitzgerald1,*, José R Penadés2,7,9,*

1The Roslin Institute, University of Edinburgh, Edinburgh, United Kingdom; 2Institute of Infection, Immunity & Inflammation, University of Glasgow, Glasgow, United Kingdom; 3Faculty of Veterinary Medicine, University of Kufa, Kufa, Iraq; 4Middle Euphrates Centre for Cancer and Genetic Research, University of Kufa, Kufa, Iraq; 5Facultad de Veterinaria, Universidad Complutense de Madrid, Madrid, Spain; 6Centre for Synthetic and Systems Biology, University of Edinburgh, Edinburgh, United Kingdom; 7Departamento de Ciencias Biomédicas, Facultad de Ciencias de la Salud, Universidad CEU Cardenal Herrera, 46113 Moncada, Valencia, Spain; 8Severe Infection Group, Health Research Institute Hospital La Fe, Valencia, Spain; 9MRC Centre for Molecular Bacteriology and Infection, Imperial College London, SW7 2AZ, UK.

a These authors contributed equally.
* Corresponding authors:
JRF ([email protected])
JRP ([email protected])

Table of Contents
Additional information
Supplementary Figures
Figure S1. Phylogenetic tree of the integrase gene from representative examples of Staphylococcus aureus prophages.
Figure S2. Phylogenetic network of representative examples of S. aureus Pathogenicity Islands.
Figure S3. Expression of SaaPIMVF7-encoded vwb.
Figure S4. Schematic representation of IS loci selected for analysis.
Figure S5. Presence of the IS does not affect expression of the downstream gene product through active transcription.
Figure S6. Pairwise genome alignment of S. aureus subsp. anaerobius versus S. aureus subsp.
aureus.

Additional Information

S. aureus subsp. anaerobius MLST variability

Multilocus Sequence Typing (MLST) of the whole genome sequences revealed more variability across strains than previously reported, which also supports the split of this lineage into two main clades. In contrast to previous results that ascribed all S. aureus subsp. anaerobius isolates to a single, homogeneous sequence type (ST1464) [1, 2], we found allelic variation in two of the seven housekeeping genes (pta and tpi). The four samples from Sudan presented an identical allelic profile, but it differed from the one deposited in the MLST database for ST1464. Specifically, these four sequences (including the genome made available by Elbir and collaborators [3]) carried allele 502 in the yqiL gene instead of allele 160, which is characteristic of ST1464 in the MLST database. A sequence comparison showed that these two alleles differ by one SNP. We found two tpi alleles in our sample set, which differed between the Sudanese and European clades (alleles 177 and 422, respectively), whereas three pta alleles were found within the European clade. In particular, a cluster of two Italian samples and one Danish sample was ascribed to ST3756, previously represented in the MLST database by a Czech sheep isolate.

Pseudogenes found in S. aureus subsp. anaerobius

In the following paragraphs, we describe some of the most noteworthy pseudogenes, grouped by function, whose presence could explain the phenotypic peculiarities of S. aureus subsp. anaerobius with respect to S. aureus subsp. aureus, especially regarding virulence and metabolism.

Defence mechanisms. We found several pseudogenised oxidoreductases (Table S1 in Additional File 2), which points to a deficiency in the protection from oxidative damage that could explain the microaerophilic nature of S. aureus subsp. anaerobius.
The catalase gene was pseudogenised in all isolates, a feature of the S. aureus subsp. anaerobius genome already reported [4] and one of the main differences between the two subspecies of S. aureus. Catalase is regarded as a defensive mechanism against the oxygen radicals produced by macrophages, and indeed the restoration of catalase activity increases resistance to H2O2 but decreases virulence in sheep, the natural host [5]. This suggests that the loss of catalase activity plays an important role in host adaptation. Other proteins involved in the evasion of oxygen radicals, such as sodA and sodM, were, however, intact, with a protein identity of 98% and 99% with their homologs in S. aureus RF122, respectively.

Other pseudogenes related to defence mechanisms were transmembrane pumps (most belonging to the ABC superfamily of transporters) involved in antibiotic resistance and fluoride toxicity, and type I restriction enzymes.

Virulence factors. Several pseudogenes encoded proteins involved in the biosynthesis of the capsular polysaccharide. Although the S. aureus subsp. anaerobius genome carries intact genes also involved in this process, the presence of these pseudogenes might imply a deficient bacterial capsule.

Other pseudogenes encoding virulence factors were adherence factors (clumping factor B, staphylococcal protein A), pore-inducing exoproteins (leukocidins lukD and lukE and γ- and α-haemolysins), and the IgG-binding protein sbi. The coagulase gene was intact in all European isolates but present as a pseudogene in the four Sudanese isolates. Contradictory results in the literature have reported coagulase activity in Spanish and Sudanese isolates but not in Kenyan or French ones [3, 6]. In addition, all S. aureus subsp. anaerobius isolates analysed here carried an intact vwb gene in SaaPIMVF7, which could influence the coagulation phenotype.
Metabolism and energy production. Twenty-nine of the 164 pseudogenes present in all isolates encoded enzymes involved in 12 amino acid metabolic pathways. The IMG annotation tool revealed that, while the isolate RF122 is auxotrophic for the amino acids lysine, phenylalanine, tyrosine, histidine and serine, S. aureus subsp. anaerobius is auxotrophic, in addition to those, for tryptophan, arginine and leucine. IMG also found intact aerobic respiration pathways in S. aureus subsp. anaerobius. This, together with the compromised defence mechanisms discussed above, suggests that this bacterium is a microaerophile due to a lack of response against oxidative stress rather than to an inability to use aerobic respiration.

Fourteen and six pseudogenes encoded proteins involved in carbohydrate and lipid transport and/or metabolism, respectively. Some of the pathways affected by the presence of pseudogenes were metabolism of sugars (galactose, fructose, sucrose and mannose), glycolysis/gluconeogenesis, pyruvate metabolism, and the phosphotransferase system (PTS, a major mechanism used for uptake of carbohydrates). One of the pseudogenes encoded the acetyl-coenzyme A synthetase, which participates in many metabolic pathways.

Finally, enzymes involved in inorganic ion/coenzyme transport (specifically nickel, magnesium, manganese, cobalt, iron and molybdenum) were also present as pseudogenes.

S. aureus subsp. anaerobius novel mobile genetic elements

Prophage. One novel prophage (ΦSaa1) belonging to the Siphoviridae family was found in all isolates, with a length of 43.2 kb and a GC content of 33.4%. It encodes 74 proteins in most cases (some isolates presented 73 due to the absence of an HNH endonuclease). The closest prophage found by PHASTER based on nucleotide similarity (90.2%) was Φ2958PVL (accession number NC_011344), a phage initially described in methicillin-resistant S. aureus subsp.
aureus isolates in Japan [7] and encoding the Panton-Valentine leukocidin, which is otherwise absent in ΦSaa1 (Fig. 1B).

We inferred gene-by-gene homology between ΦSaa1 and Φ2958PVL using Roary [8]. Based on this comparison, 24 of the 74 genes in ΦSaa1 (32.4%) were split versions of 10 homologous genes in Φ2958PVL, whereas 37 (50%) were intact with homologues in Φ2958PVL, and 13 ΦSaa1 genes (17.6%) were exclusive. Most of the latter genes belonged to the lysogeny functional module. A phylogenetic tree of integrase sequences from representative S. aureus subsp. aureus phage lineages [9], constructed using FastTree v2.1.10, revealed that the ΦSaa1 integrase falls outside the major integrase groups (Fig. S1 in Additional File 1), with Sa2int (the group to which the integrase of Φ2958PVL belongs) being its closest relative. The nucleotide identity between the two integrases was 78%.

The only putative virulence factor carried by ΦSaa1 was the virulence-associated protein E, virE, whose gene was also present as a pseudogene.

Pathogenicity island. We also identified one novel, 13 kb-long Staphylococcal Pathogenicity Island (SaPI) present in all isolates, which encoded 21 proteins including copies of the Staphylococcal complement inhibitor (SCIN) and the von Willebrand factor-binding protein (vWbp) (Fig. 1B). vWbp has previously been demonstrated to show host-specific activity, being able to coagulate ruminant plasma [10, 11]. This new SaPI (SaaPIMVF7) is inserted in the groES-groEL site (attB type V), the same genomic location as others such as SaPIov2 (ED133, ST133), and SaPIbov3