bioRxiv preprint doi: https://doi.org/10.1101/415422; this version posted September 12, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Identification and capsular serotype sequetyping of Streptococcus pneumoniae strains
Lucia Gonzales-Siles a,b,*, Francisco Salvà-Serra a,b,c,d, Anna Degerman a, Rickard Nordén a, Magnus
Lindh a, Susann Skovbjerg a,b, Edward R. B. Moore a,b,d.
a Department of Infectious Diseases, Institute of Biomedicine, University of Gothenburg,
Gothenburg, Sweden
b Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg,
Sweden
c Microbiology, Department of Biology, University of the Balearic Islands, Palma de Mallorca,
Spain
d Culture Collection University of Gothenburg (CCUG), Department of Clinical Microbiology,
Sahlgrenska University Hospital, Gothenburg, Sweden
* Corresponding author
E-mail address: [email protected] (LG)
Post address: Guldhedsgatan 10A 41346 Gothenburg, Sweden
ABSTRACT
Correct identification of Streptococcus pneumoniae (pneumococcus) and differentiation from the closely
related species of the Mitis group of the genus Streptococcus, as well as serotype identification, is important
for monitoring disease epidemiology and assessing the impacts of pneumococcal vaccines. In this study, we
assessed the taxonomic identifications of 422 publicly available genome sequences of S. pneumoniae, S.
pseudopneumoniae and S. mitis, using different methods. Identification of S. pneumoniae, by comparative
analysis of the groEL partial sequence, was possible and accurate, whereas S. pseudopneumoniae and S.
mitis could be misclassified as S. pneumoniae, suggesting that groEL is unreliable as a biomarker for
differentiating S. pneumoniae from its closest related species. The genome sequences of S. pneumoniae and
S. pseudopneumoniae fulfilled the suggested thresholds of average nucleotide identity (ANI), i.e., >95%
genome sequence similarity to the sequence of respective type strains for identification of species, whereas
none of the S. mitis genome sequences fulfilled this criterion. However, ANI analyses of all sequences
versus all sequences allowed discrimination of the different species by clustering, with respect to species
type strains. The in silico DNA-DNA distance method was also inconclusive for identification of S. mitis
genome sequences, whereas presence of the “Xisco” gene proved to be a reliable biomarker for S.
pneumoniae identification. Furthermore, we present an improved sequetyping protocol including two
newly-designed internal sequencing primers with two PCRs, as well as an improved workflow for
differentiation of serogroup 6 types. The proposed sequetyping protocol generates a more specific product
by generating the whole gene PCR-product for sequencing, which increases the resolution for identification
of serotypes. Validations of both protocols were performed with publicly available S. pneumoniae genome
sequences, reference strains at the Culture Collection University of Gothenburg (CCUG), as well as with
clinical isolates. The results were compared with serotype identifications, using real-time Q-PCR analysis,
as well as the Quellung reaction or antiserum panel gel-precipitation. Our protocols provide a reliable
diagnostic tool for taxonomic identification as well as serotype identification of S. pneumoniae.
Keywords: Streptococcus pneumoniae; serotype; sequetyping; Mitis group
INTRODUCTION
Streptococcus pneumoniae (pneumococcus) causes invasive and non-invasive disease, including
pneumonia, meningitis, sepsis, otitis media, sinusitis, among others, particularly in children under
the age of 5 years and the aged (Johnson et al., 2010), leading to approximately a million deaths
annually in children aged less than 5 years, globally (Collaborators, 2017). A characteristic feature
and the main virulence factor of S. pneumoniae is the polysaccharide capsule that enables the
bacterium to evade host defence mechanisms (Nelson et al., 2007) and which is the basis for
epidemiological categorization of pneumococcal isolates and strains into serotypes and serogroups
(Geno et al., 2015). To date, 97 different capsular serotypes within 46 serogroups of S. pneumoniae
have been identified on the basis of the biochemical structure of the capsular polysaccharide (Geno
et al., 2015).
Several pneumococcal vaccines, which differ according to the polysaccharide capsule composition,
have been developed. The first pneumococcal conjugate vaccine (PCV), licensed in 2000, covered
7 serotypes (PCV7: 14, 6B, 19F, 23F, 4, 9V, 18C) (Hicks et al., 2007), followed by PCV10 (PCV7
serotypes plus serotypes 1, 5, and 7F) in 2009 (Esposito and Principi, 2015), and PCV13 (PCV10
serotypes plus serotypes 3, 6A, and 19A) in 2010 (Geno et al., 2015). A 15-valent conjugate
vaccine, which also includes serotypes 22F and 33F, is currently in clinical trials (LeBlanc et al.,
2017). The pneumococcal polysaccharide vaccine (PPSV23) protects against 23 different capsular
types (1, 2, 3, 4, 5, 6B, 7F, 8, 9N, 9V, 10A, 11A, 12F, 14, 15B, 17F, 18C, 19A, 19F, 20, 22F, 23F
and 33F), and covers a high percentage of the types found in pneumococcal bloodstream infections.
The vaccine is widely used for adults who are considered to be at high risk, as well as in children
older than 2 years and at increased risk for pneumococcal disease (Diao et al., 2016). The use of
the conjugate vaccines has significantly reduced the burden of pneumococcal disease in many
populations. However, since vaccine introduction, “serotype replacement” has been observed, with
increases in the proportions of invasive and non-invasive disease caused by pneumococcal
serotypes not covered by the vaccines (Hicks et al., 2007; Weinberger et al., 2011).
S. pneumoniae serotype 6C is an example of an opportunistic increase in an infectious
pneumococcus through serotype replacement. Serotype 6C was described as a newly-recognized
serotype in 2007 (Mavroidi et al., 2004) and appears to have been rare in pre-vaccination
populations. However, since the introduction of PCV7, the incidence of serotype 6C in disease and
carriage has increased in diverse populations, worldwide (Loman et al., 2013). PCV7 contains
polysaccharide from the 6B serotype capsule and PCV13 later included capsular polysaccharide of
serotype 6A, although current vaccines do not extend protection to serotype 6C, which likely has
promoted the observed serotype replacement (Park et al., 2008). Such serotype transitions
demonstrate the importance of maintaining surveillance programs and clinical protocols that are
able to respond to the evolutionary plasticity of infectious disease.
The classical serotyping method, the Quellung reaction, is based on the reaction of
serotype-specific antisera with the corresponding capsule (Neufeld, 1910). This method is time-consuming
and costly, requiring live, cultivable bacteria, and a high degree of expertise, to the point that few
laboratories are able to carry out the analyses. During the last decade, the nucleotide sequences of
the capsule polysaccharide synthesis (CPS) loci (cps), harbouring the genes responsible for
synthesis of the pneumococcal cell polysaccharide capsule, have been determined for all known
serotypes. Accordingly, DNA amplification-based methods targeting specific capsular synthesis
genes that allow differentiation of the serotypes have been developed, i.e., sequential multiplex
PCR and sequential real-time Q-PCR (Pai et al., 2006; Varghese et al., 2017). Recently, a PCR-
amplification and DNA-sequence-based typing method, ‘sequetyping’, was described targeting the
regulatory gene, cpsB, with a single multiplex PCR, enabling the amplifications of 84 serotypes
and sequencing of PCR-products, differentiating 46 of the 93 serotypes recognized at that time
(Leung et al., 2012).
As an important practical step, before initiating pneumococcal serotype identification, it is critical
to confirm the identification of S. pneumoniae and differentiate it from the other species of the
Mitis group of the genus Streptococcus (Kawamura et al., 1995). The most closely-related species
of S. pneumoniae are S. pseudopneumoniae and S. mitis. Sequencing of the 16S rRNA genes
identifies a cytosine nucleotide at position 203 as a pneumococcal sequence signature, with an
adenosine residue in all other species of the Mitis group (Scholz et al., 2012). Partial sequence
determinations of individual metabolic ‘housekeeping’ genes, as a multi-locus sequence analysis
(MLSA) (Bishop et al., 2009), continue to be widely used for identifying strains at the species
level; for Streptococcus, groEL, gyrB, rpoB and sodA have been described as biomarker housekeeping
genes for identification of the species of the genus (Glazunova et al., 2009;
Hoshino et al., 2005; Kawamura et al., 1999; Teng et al., 2002). Additionally, the recently
described “Xisco” gene, which has been demonstrated to be a unique biomarker for S. pneumoniae,
provides a new approach for confirming specific differentiation between S. pneumoniae and its
close relatives of the Mitis group (Salvà-Serra et al., 2017). Genome-based methods, such as
average nucleotide identity (ANI) and in silico DNA-DNA hybridization, are gaining recognition
as robust measurements of relatedness between strains, with potential for confirming phylogenetic
and taxonomic relationships in bacterial identification (Konstantinidis and Tiedje, 2005;
Meier-Kolthoff et al., 2013).
In this study, we present an improved workflow for pneumococcal serotype identification,
including subtyping within serogroup 6 by sequetyping, as well as S. pneumoniae species
confirmation and differentiation from closely related streptococcal species.
METHODS
Bacterial strains
One-hundred thirty-eight pneumococcus strains with identified serotypes were obtained from the
Culture Collection University of Gothenburg, Gothenburg, Sweden (CCUG), where they were
maintained in lyophilized state for long-term storage. The serotypes of these strains were
determined at the Statens Serum Institut in Copenhagen, Denmark, by the Quellung reaction
(Slotved et al., 2016), or at the Public Health Agency of Sweden, using an antiserum panel gel-
precipitation protocol (Jauneikaite et al., 2015). Additionally, 50 strains, isolated from blood and
cerebrospinal fluid samples during 2013 and 2014 and identified as S. pneumoniae at the Clinical
Microbiological Laboratory, Sahlgrenska University Hospital, Gothenburg, Sweden, were
included in the study. The strains isolated from clinical samples were stored in freeze-drying
medium (Fry and Greaves, 1951), at -70 °C.
Genome sequence data
A local database was created, including all genome sequences of S. pneumoniae (n=328) that were
available in GenBank (Benson et al., 2017) on the 14th March 2015, plus the type strain S.
pneumoniae NCTC 7465T (GenBank accession number: LN831051) and all genome sequences
that were available in GenBank on the 18th May 2016 for 14 other species of the Mitis group
(n=248): S. pseudopneumoniae (n=40), S. mitis (n=53), S. australis (n=2), S. cristatus (n=16), S.
dentisani (n=2), S. gordonii (n=22), S. infantis (n=7), S. massiliensis (n=2), S. oralis (n=34), S.
parasanguinis (n=29), S. peroris (n=1), S. sanguinis (n=33), S. sinensis (n=1) and S. tigurinus
(n=6) (Jensen et al., 2016).
DNA extraction
The stored strains from CCUG or clinical samples were inoculated onto Blood Agar plates with
horse blood 5% (prepared at the Substrate Department, Clinical Microbiological Laboratory,
Sahlgrenska University Hospital), and incubated overnight at 36 °C with 5% CO2. DNA was
extracted, using a ‘heat-shock’ protocol (Welinder-Olsson et al., 2000). Briefly, an inoculating
loop-full of bacterial biomass was suspended and incubated in 100 μL Tris-EDTA buffer and 15
μL lysostaphin 0.05 μM (Sigma-Aldrich, St. Louis, MO, USA) at 37 °C for 1 hour. Subsequently,
10 μL of Proteinase K (Sigma-Aldrich, St. Louis, MO, USA) were added and the suspensions were
incubated for 2 hours at 56 °C. Finally, the samples were incubated at 95 °C for 10 minutes. After
incubation, samples were centrifuged at 17,900 × g for 10 min. The supernatant, containing
genomic DNA, was transferred to a new tube and stored at -20 °C.
For multiplex PCR analyses, bacterial DNA was extracted, using a MagNA Pure LC instrument
(Roche Diagnostics, Mannheim, Germany) and a Total Nucleic Acid Isolation kit (Roche
Diagnostics, Mannheim, Germany). The extracted DNA was eluted in 100 µl of elution buffer, and
stored at -20 °C, until real-time multiplex PCR-assays were performed.
Taxonomic identifications
Identifications of reference strains and strains isolated from clinical samples were determined by
PCR-amplification and sequence analysis of partial (757 bp) groEL gene, using primers,
StreptogroELd and StreptogroELr, as previously described (Glazunova et al., 2009). PCR-products
were purified and sequenced (GATC Biotech AG, Constance, Germany). The sequences were
compared with the groEL partial sequences of the type strains of the 20 validly published species
of the Mitis group of the genus Streptococcus, using BioNumerics software platform, version 7.5
(Applied Maths, Sint-Martens-Latem, Belgium). A strain was assigned to a given species if the
partial groEL sequence similarity value was above 96%. The strains were also analysed for
presence of the “Xisco” gene, using amplification-primers, Spne-CW-F2 and Spne-CW-R,
according to Salvà-Serra et al. (2017).
The taxonomic status of the 422 genome sequences of S. pneumoniae, S. pseudopneumoniae and
S. mitis included in the local data base were assessed by determining average nucleotide identity,
based on BLAST (ANIb) (Goris et al., 2007), using JSpeciesWS (Richter et al., 2016), against the
reference genome sequences of the type strains of the different species. Additionally, the matrix
obtained from ANIb similarities of all vs. all genome sequences was used to construct an ANIb-
based dendrogram, according to Gomila et al. (2015). Briefly, the matrix of ANIb values was used,
applying Pearson’s distance correlation and an average linkage construction (UPGMA hierarchical
clustering), using PermutMatrix software (Caraux and Pinloche, 2005). Finally, in silico DNA-
DNA distance values were calculated, using the Genome-to-Genome Distance Calculator (GGDC),
(ggdc.dsmz.de) (Meier-Kolthoff et al., 2013) and the recommended BLAST+ method. The GGDC
results shown are based on the recommended formula 2 (sum of all identities found in high-scoring
segment pairs (HSPs), divided by the overall HSP length), which is independent of genome size
and is, thus, robust when using draft genomes.
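The clustering step described above can be illustrated with a minimal sketch in Python. It is not the PermutMatrix implementation: the function names (pearson_distance, upgma), the genome labels and the ANIb values are hypothetical, and the Pearson distance is assumed here to be 1 minus the correlation coefficient of two rows of the all-versus-all ANIb matrix.

```python
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length ANIb profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)  # undefined for a constant profile


def upgma(names, ani_matrix):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    size = len(names)
    dist = [[pearson_distance(ani_matrix[i], ani_matrix[j]) for j in range(size)]
            for i in range(size)]
    # Each cluster is (subtree, indices of the genomes it contains).
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def average(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        _, p, q = min((average(clusters[p][1], clusters[q][1]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        merged = ((clusters[p][0], clusters[q][0]), clusters[p][1] + clusters[q][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return clusters[0][0]


# Hypothetical ANIb values (%) for two type strains and two query genomes.
names = ["pneumoniae_T", "query_1", "mitis_T", "query_2"]
ani_matrix = [
    [100.0, 98.5, 92.0, 91.5],
    [98.5, 100.0, 91.8, 91.6],
    [92.0, 91.8, 100.0, 94.0],
    [91.5, 91.6, 94.0, 100.0],
]
tree = upgma(names, ani_matrix)
print(tree)
```

In this toy matrix, each query genome joins the type strain whose ANIb profile it most resembles, which is the basis on which genomes were assigned to species by clustering.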
Modified Sequetyping protocol
The sequetyping protocol was based on analysis of the capsule polysaccharide synthesis cpsB
region (Leung et al., 2012). In order to obtain sufficient quality for the entire 1,061 bp segment,
two internal primers were designed, wzh-mid-F and wzh-mid1-R, generating two partly
overlapping sequences (Figure 1). The reaction mixture for PCR-assays comprised 0.1 to 10 ng of
DNA template, 1X Taq PCR Master Mix (Qiagen, Hilden, Germany), 1 μM concentration of each
amplification-primer, in a total volume of 25 μL. Primer sequences are listed in Supplementary
Table 1. PCR-amplification was achieved, with an initial cycle of 5 min denaturation at 95°C and
30 cycles of 30 s at 95°C for denaturation, 30 s at 55°C for primer-annealing and 90 s at 72°C for
primer-extension, with a final extension step at 72°C for 5 min. Amplicons were analysed by
electrophoresis in 1% agarose gel. Sequencing reactions were performed using the four primers
(Figure 1).
A database of reference sequences was created for comparative analyses, including the sequences
for each serotype listed by Bentley et al. (2006), as well as the complete cpsB region sequences
extracted from the 329 S. pneumoniae genome sequences. The PCR-amplicon nucleotide
sequences were analysed by similarity analysis, using BioNumerics software platform, version 7.5
(Applied Maths, Sint-Martens-Latem, Belgium). A strain was assigned to a given serotype if the
similarity value was higher than 99% and the similarity of the second highest match was, at least,
1% lower. If the similarity value was shared between two or more serotypes, it was reported as
multiple-matched serotypes.
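The assignment rule can be summarised in a short Python sketch. The function name, the input format (a mapping of serotype to percent similarity) and the example values are hypothetical. The text does not state what is reported when the best match exceeds 99% but the runner-up is less than 1% lower without being tied; the sketch assumes that such cases are left undetermined.

```python
def assign_serotype(similarities, min_best=99.0, min_gap=1.0):
    """Assign a serotype from percent similarities to reference cpsB sequences."""
    if not similarities:
        return "not determined"
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    best_type, best = ranked[0]
    if best <= min_best:
        return "not determined"  # best match must be higher than 99%
    tied = sorted(s for s, value in ranked if value == best)
    if len(tied) > 1:
        return "/".join(tied)  # similarity shared: multiple-matched serotypes
    if len(ranked) > 1 and best - ranked[1][1] < min_gap:
        return "not determined"  # assumption: a gap below 1% is treated as ambiguous
    return best_type


print(assign_serotype({"19F": 99.8, "19A": 97.2}))
print(assign_serotype({"10B": 99.5, "10C": 99.5, "6B": 90.1}))
print(assign_serotype({"24F": 97.9, "24A": 95.0}))
```

The two thresholds are kept as parameters so that the effect of a stricter or looser cut-off on the number of typeable strains can be explored.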
BLASTN analyses of the cpsB region were also performed, with respect to the 248
non-pneumococcal genome sequences, in order to determine if this region is present in other species of
the Mitis group. Only BLAST hits with an E-value lower than 10e-5 were considered significant.
Additionally, the four sequetyping primers were also analysed, using BLASTN against the 248
non-pneumoniae genome sequences. Only matches covering the entire primer length (100%
coverage) with a maximum of 2 mismatches were considered to be positive.
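The primer-matching criterion (full-length coverage and no more than 2 mismatches) can be sketched as an ungapped scan of both strands. This is a stand-in for illustration, not the BLASTN analysis used in the study; the function names are hypothetical, and the primer and target below are toy sequences, not the sequetyping primers listed in Supplementary Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_hits(primer, genome, max_mismatches=2):
    """List (strand, start, mismatches) for full-length, ungapped primer matches."""
    hits = []
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        for start in range(len(genome) - len(query) + 1):
            window = genome[start:start + len(query)]
            mismatches = sum(1 for a, b in zip(query, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((strand, start, mismatches))
    return hits


# Toy example: the target carries the primer site with a single mismatch.
toy_primer = "GATTACAGGC"
toy_genome = "TTTTGATAACAGGCTTTT"
print(primer_hits(toy_primer, toy_genome))
```

A PCR product would be predicted only when a forward and a reverse primer both yield hits on opposite strands, in the correct orientation and at a plausible distance; that pairing step is omitted here for brevity.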
Serotype identification by multiplex real-time PCR.
A multiplex real-time PCR, able to detect 40 different serotypes, was developed and applied. The
assay is based on a protocol published by Centers for Disease Control and Prevention (CDC) (da
Gloria Carvalho et al., 2010), and is similar to the real-time PCR subsequently developed by CDC
(Pimenta et al., 2013). A complete list of primers and probe sequences can be found in
Supplementary Table 2. The multiplex real-time PCR was performed in 384-well format in a
QuantStudio 6 Flex (Applied Biosystems, Carlsbad, CA). Each PCR consisted of a 20 µl reaction volume,
including 4 μl of template DNA, along with 1 µM of each of the forward and reverse primers, 0.85
µM of the probe, 10 µl of 2X Universal Master Mix for DNA targets (Applied Biosystems, Foster
City, CA, USA) and RNase-free water. The Tecan Freedom EVO PCR setup workstation (Life
Sciences, Männedorf, Switzerland) was used to prepare the PCR assays in a 384-well plate. The
reaction conditions were as follows: one initial cycle at 46°C for two minutes, followed by
denaturation at 95°C for 10 minutes and 45 amplification cycles of 95°C for 15 s and 58°C for one
minute. Each multiplex performance was evaluated, using an internal control (cpsA) to verify the
presence of pneumococcal DNA in the sample, as well as two pUC57 plasmids containing each
PCR target amplicon for all serotype systems.
RESULTS
Classification of database sequences
In order to re-evaluate the classifications of the genome sequences of S. pneumoniae, S.
pseudopneumoniae and S. mitis in the genome sequence database, ANIb analyses were performed,
wherein each genome sequence was compared to that of the type strain of each species of the Mitis
group and by comparisons of all genomes to all. The analyses showed that all 328 strains (type
strain excluded) listed as S. pneumoniae in the GenBank database were correctly identified as S.
pneumoniae, i.e., had similarity values greater than 95% (Rosselló-Móra and Amann, 2015). Of
them, 24 sequences exhibited ANIb similarity values ≥99% to the sequence of the type strain, 271
strains exhibited ≥98% similarity and 33 strains exhibited ≥97% similarity. By comparison, only 9
of the 39 sequences from strains listed as S. pseudopneumoniae (type strain excluded) in GenBank
exhibited ANIb values ≥95%, whereas 48 of the 52 sequences from strains listed as S. mitis (type
strain excluded) exhibited ANIb values below 95% and only four strains had ANIb values between
95-96%, indicating a significant number of misclassifications of strains for which genome
sequence data had been submitted to GenBank.
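The threshold-based check applied to each genome can be expressed as a small Python sketch. The function names and the ANIb values are hypothetical, and the 95% cut-off is treated as inclusive, as in the "≥95%" wording above.

```python
def classify_by_ani(ani_to_type_strains, threshold=95.0):
    """Return the species whose type strain gives the highest ANIb, if >= threshold."""
    species, best = max(ani_to_type_strains.items(), key=lambda kv: kv[1])
    return species if best >= threshold else "unassigned"


def label_is_supported(database_label, ani_to_type_strains, threshold=95.0):
    """Check whether the species label of a database entry is supported by ANIb."""
    return classify_by_ani(ani_to_type_strains, threshold) == database_label


# Hypothetical ANIb values (%) of two genomes against three type strains.
genome_a = {"S. pneumoniae": 98.4, "S. pseudopneumoniae": 93.7, "S. mitis": 92.1}
genome_b = {"S. pneumoniae": 92.6, "S. pseudopneumoniae": 93.0, "S. mitis": 94.2}
print(classify_by_ani(genome_a), label_is_supported("S. pneumoniae", genome_a))
print(classify_by_ani(genome_b), label_is_supported("S. mitis", genome_b))
```

A genome such as the second one, which stays below the threshold for every type strain, is left unassigned by this rule and can only be placed by the all-versus-all clustering.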
Additionally, cluster analysis was performed, using the calculated ANIb similarity values for all strains
against all, for S. pseudopneumoniae and S. mitis, including, as well, the type strains of the other
13 species of the Mitis group included in the genomes database. With this analysis, a dendrogram
was generated, to visualize the relationships among the strains, with respect to the type strains of
the different species (Supplementary Figure 1). In the case of S. pneumoniae, all genome
sequences clustered most closely with the type strain of S. pneumoniae, confirming the taxonomic
designations for the genome sequences. However, only nine of the 39 genome sequences listed as
S. pseudopneumoniae and thirty-six of the 52 genome sequences listed as S. mitis in the database
clustered in proximity to the type strain of the respective species and were, therefore, taxonomically
designated as S. pseudopneumoniae and S. mitis, while the remaining 46 strains clustered closer to
other species.
The GGDC analyses, comprising comparison with type strains for each species, showed that all S.
pneumoniae genome sequences had in silico hybridization values higher than 70%, confirming
their taxonomic identities. The GGDC analyses for S. pseudopneumoniae matched the results
obtained by ANIb analysis (Table 1), whereas only one of the genome sequences of S. mitis
exhibited a hybridization value higher than 70%; for the rest of the genome sequences, the in silico
DNA-DNA hybridization values were lower than 70% and inconclusive for confirming species-
level identifications (Table 2).
Partial groEL sequence analyses using the region of the gene suggested by Glazunova et al. (2009)
were also performed. The 757 bp groEL sequence was extracted from all genome sequences of S.
pneumoniae, S. mitis and S. pseudopneumoniae and similarity values of the sequences were
calculated, with respect to the type strains of the 14 species of the Mitis group of Streptococcus.
By this analysis of partial groEL sequences, with sequence similarities above 96% (cut-off value)
with the type strains of the respective species, all 328 genome sequences listed as representing S.
pneumoniae genomes (type strain excluded) were identified as S. pneumoniae, whereas only three
of 39 genomes listed as S. pseudopneumoniae (type strain excluded) were identified as S.
pseudopneumoniae (Table 1), and 28 of the 52 sequences listed as S. mitis (type strain excluded)
were identified as S. mitis. In four of the strains, the groEL gene could not be found in the genome
sequence (Table 2). The classifications of the remaining 17 genome sequences were ambiguous,
with non-definitive similarity values for S. mitis, as well as S. pneumoniae and S.
pseudopneumoniae, allowing no clear species-level identifications (Table 2). Finally, the “Xisco”
gene was not detected in any genome sequence listed as S. mitis or S. pseudopneumoniae, whereas
the “Xisco” gene was present in all genome sequences identified as S. pneumoniae. A summary of
the results for the genome sequences that were listed in GenBank as S. pseudopneumoniae or S. mitis
but were taxonomically misassigned is presented in Supplementary Table 3.
Based on these results, discrepancies were observed when comparing the results of identifications
of genome sequences obtained by genome sequence ANIb analysis and results obtained by partial
groEL sequencing, suggesting that groEL may not be as reliable a marker as anticipated for
identification of the closely related species of the Mitis group of the genus Streptococcus.
Identification of Streptococcus pneumoniae in culture collections and clinical samples
In cultivated and isolated clinical strains (n=50) as well as in the 138 strains from the CCUG
previously identified as S. pneumoniae, the “Xisco” gene was present in 100% of strains.
Furthermore, groEL similarity values were observed to be greater than 98% in all strains and
greater than 99% in two-thirds of the analysed strains, confirming by two independent techniques
that the strains were correctly identified as S. pneumoniae.
Serotype identification
The sequetyping technique of Leung et al. (2012) was modified by using two internal primers to
generate two partially overlapping amplicons, representing the whole 1,017 bp cpsB-region (Figure
1). To assess its accuracy, sequetyping was evaluated in silico by analysing cpsB sequences that
were extracted from the 329 genome sequences of S. pneumoniae in the local genome sequence
database. The serotypes of 261 (80%) of these genomes were identified, with similarity values
greater than 99% to a reference sequence. In 15 strains, similarity matches were low and, therefore,
the strains could not be assigned to a serotype: 8 strains exhibited best matches to serotypes 10B-10C
(less than 97%) and 7 strains exhibited best matches to serotype 24F (less than 98%). The serotypes
of 52 genomes could not be determined. Thirty-six of these genomes were previously described as
‘non-typeable’ pneumococci (Hathaway et al., 2004) and, therefore, the cpsB sequence was not
present (Table 3). For the remaining 16 genomes, the serotypes could not be determined, due to
low similarity values, with respect to the reference sequences.
BLASTN analyses of the cpsB region, with respect to the 248 non-pneumoniae genome sequences,
gave 22 positive hits with E-values lower than 10e-5 but with similarities ranging between 80% and
93% (Supplementary Table 4), suggesting that the cpsB region could also be present in other
species of the Mitis group. Analysis of the probability for the four sequetyping primers to amplify
among the 248 non-pneumoniae genome sequences considering a maximum of 2 mismatches
showed that the PCR including the primers cps1 - wzh-mid-R could lead to 2 positive
amplifications, whereas the PCR reaction including the primers wzh-mid-F – cps2 could lead to 13
positive amplifications. However, the whole cpsB region would be expected to be amplified in only
two cases, in S. mitis SK579 and S. mitis SK616 (Supplementary Table 4).
The sequetyping was applied to the 138 S. pneumoniae strains from the CCUG, for which the
serotypes had previously been determined by the Quellung reaction or the antiserum panel gel-
precipitation protocol. The determined sequences were analysed by BLAST searches, and
similarities were recorded. A sequence was assigned to a specific serotype if the similarity value
was greater than 99% and the next-best similarity match was, at least, 1% lower. In 140 strains
(97%), the serotype by sequetyping matched the results obtained by the reference methods.
Discrepancies were observed for five strains: CCUG 1749 (17A) was identified as 10A;
CCUG 5906 (36) was identified as 15B; CCUG 20653 (48) was identified as 6B; CCUG 27692
(15A) was identified as 19B; and CCUG 55117 (16A) was identified as 48.
Finally, the serotypes of 50 strains isolated from clinical samples were determined, using
sequetyping, multiplex real-time PCR and antiserum panel gel-precipitation protocol performed at
the Public Health Agency of Sweden. A serotype was identified by real-time PCR for 36 of the 50
strains (72%), whereas the serotypes were identified only by sequetyping for the remaining 14
strains (28%), showing serotypes that were not targeted by the real-time PCR assay. In all cases,
the obtained results agreed with those obtained by the antiserum panel protocol (Table 4).
Serogroup 6 differentiation
A dendrogram, based on the sequence of the entire cpsB region (1,017 bp) from all strains
classified as serogroup 6, was created (Supplementary Figure 2). The sequences did not form
distinct clusters, indicating that serotype differentiation among serogroup 6 is not possible by
sequetyping of this region and that an alternative method is needed. A DNA sequencing-dependent
approach was used for differentiating the serotypes 6A, 6B, 6C and 6D; a schema of the suggested
protocol is shown in Figure 2. Firstly, a PCR-amplification, using the primers, wciP374F (this
study) and wciP-R (Jin et al., 2009), was performed and the PCR-product was sequenced, using
primer, wciP374F. Primer sequences are listed in Supplementary Table 1. The sequence product
allowed visualization of the single nucleotide polymorphism (SNP) that distinguishes serotype
6A/6C (guanine in position 584) and serotype 6B/6D (adenine in position 584). Subsequently, for
differentiating serotype 6A and 6C, a second PCR, using primers, Del6Cwzy_Fv2 (this study) and
Del6Cwzy_R (Jin et al., 2009), followed by Sanger sequencing, using primer Del6Cwzy_Fv2, was
performed. A 6 bp deletion in the gene wzy, characteristic for serotype 6C, was detected. The in
silico PCR analysis showed that it is possible to obtain PCR-products for serotypes 6A, 6B and 6C
but not for serotype 6D (Table 5), although, the deletion was present only in serotype 6C. For
differentiating serotype 6B from 6D, a PCR, using primers wciN_6AB_F and wciN_6AB_R (this
study), was performed. This PCR was shown to be specific for serotype 6B; thus, if a PCR-product
was obtained, the strain was assigned to serotype 6B, whereas, if the PCR was negative, the strain
was assigned to serotype 6D. To finally confirm serotype 6D, two additional PCR-assays, targeting
the wciNbeta region, were performed: the first using primers wciNbetaS1/wciNbetaA2 (Jin et al.,
2009) and the second using primers wciNbetaS2/wciNbetaA1 (Jin et al., 2009). If at least one of
the PCR-assays was positive, the strain was confirmed as serotype 6D. Details of the analysis are
presented in Table 5.
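The decision steps of the serogroup 6 protocol described above can be summarised as a small function. This is only an illustrative sketch, not part of the published protocol: the function name, the parameter names and the "unresolved" return labels are assumptions introduced here, and the inputs are assumed to have been read manually from the sequencing and PCR results.

```python
from typing import Optional, Sequence


def call_serogroup6(
    wcip_base_584: str,
    wzy_6bp_deletion: Optional[bool] = None,
    wcin_6ab_pcr_positive: Optional[bool] = None,
    wcinbeta_pcr_positive: Sequence[bool] = (),
) -> str:
    """Return a serotype call for a serogroup 6 strain (hypothetical helper)."""
    base = wcip_base_584.upper()
    if base == "G":
        # Guanine at position 584 of the wciP sequence: serotype 6A or 6C.
        if wzy_6bp_deletion is None:
            return "6A/6C"  # wzy sequencing result not yet available
        # The 6 bp deletion in wzy is characteristic for serotype 6C.
        return "6C" if wzy_6bp_deletion else "6A"
    if base == "A":
        # Adenine at position 584: serotype 6B or 6D.
        if wcin_6ab_pcr_positive is None:
            return "6B/6D"  # wciN_6AB PCR result not yet available
        if wcin_6ab_pcr_positive:
            return "6B"
        # Negative wciN_6AB PCR: assigned to 6D, confirmed when at least
        # one of the two wciNbeta PCR-assays is positive.
        return "6D" if any(wcinbeta_pcr_positive) else "6D (unconfirmed)"
    raise ValueError(f"unexpected base at position 584: {wcip_base_584!r}")


print(call_serogroup6("G", wzy_6bp_deletion=True))
print(call_serogroup6("A", wcin_6ab_pcr_positive=False,
                      wcinbeta_pcr_positive=(False, True)))
```

Returning an explicit unresolved label (for example "6A/6C") when a later step has not been performed keeps partial results visible instead of forcing a premature serotype call.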
The sequence analyses performed for the target regions of the genomes showed that the regions
where the primers anneal are highly conserved; thus, PCR-amplification is expected to be specific
and reliable. The proposed protocol was tested in all CCUG strains identified as serogroup 6 by
Quellung reaction and in the clinical isolates identified as serogroup 6. When the proposed
protocol was tested, results similar to those of the Quellung reaction were obtained in 9
strains, except for CCUG 3114, which was previously described as 6A and was reclassified
as serotype 6C.
DISCUSSION
Correct identification of S. pneumoniae strains is crucial for choosing the proper treatment
options and for assessing the burden of disease. As a general standard, routine culture-based
identification of S. pneumoniae consists of bile solubility and optochin susceptibility tests (Richter
et al., 2008). A proportion of isolates give inconsistent results in optochin
susceptibility and bile solubility tests and are referred to as ‘atypical’ pneumococci. In addition, similar
biochemical properties are also present in a significant proportion of other closely related species
of the Mitis group, such as S. mitis and S. pseudopneumoniae, especially when samples are from
respiratory sites (Keith et al., 2006; Rolo et al., 2013). The in silico analysis performed in this study
showed that identification of S. pneumoniae by analysis of the groEL partial sequence was possible
and reliable, whereas S. pseudopneumoniae and S. mitis could be misclassified as S. pneumoniae,
suggesting that groEL is an unreliable marker for differentiating S. pneumoniae from its closest
related species. In the studies of Glazunova et al. (2009) and Teng et al. (2002),
where groEL was proposed to differentiate S. pneumoniae from other
species of the Mitis group, few strains of each species were used for the analysis; in our study, we
included all the genome sequences available in the database at the time the study was performed.
These results point to the risk that partial gene sequence analysis may lead to misclassification, for
example, due to horizontal gene transfer. Horizontal gene transfer and homologous recombination
involving groEL most likely occur between species, as has been suggested previously for the sodA
and rpoB genes (Varghese et al., 2017).
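The groEL misclassification can be illustrated with the nine confirmed S. pseudopneumoniae genome sequences listed in Table 1. The following minimal sketch simply tallies the species of the first groEL match for each strain; the strain names and matches are copied from Table 1, whereas the variable names are assumptions made for this illustration.

```python
from collections import Counter

# Species of the first groEL match for each confirmed S. pseudopneumoniae
# strain, as listed in Table 1.
GROEL_FIRST_MATCH = {
    "1321": "S. pseudopneumoniae",
    "22725": "S. pneumoniae",
    "276-03": "S. pneumoniae",
    "338-14": "S. pneumoniae",
    "5247": "S. pseudopneumoniae",
    "61-14": "S. pneumoniae",
    "G42": "S. pneumoniae",
    "IS7493": "S. pneumoniae",
    "SK674": "S. pseudopneumoniae",
}

# Count how often each species appears as the best groEL match.
first_match_counts = Counter(GROEL_FIRST_MATCH.values())

# Strains whose best groEL match is not S. pseudopneumoniae.
misassigned = sorted(
    strain
    for strain, match in GROEL_FIRST_MATCH.items()
    if match != "S. pseudopneumoniae"
)

print(first_match_counts)
print(misassigned)
```

In this tally, six of the nine strains have S. pneumoniae as their first groEL match, which is the pattern of misclassification discussed above.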
High degrees of horizontal gene transfer and homologous recombination (Chi et al., 2007; Jensen
et al., 2016; Kilian et al., 2008) between S. pneumoniae and commensal viridans group streptococci
have given rise to genotypic ambiguities between S. pneumoniae and closely related species, such
as S. mitis, S. pseudopneumoniae and S. oralis (Kilian et al., 2008; Kilian et al., 2014; Whatmore
et al., 2000). Multi-Locus Sequence Analysis (MLSA) for the viridans group streptococci,
developed by Bishop et al. (2009), and core genome phylogenetic analyses
(Jensen et al., 2016) are genome-based techniques that can differentiate the viridans group
streptococci to the species level. ANIb similarity determination is gaining relevance as a robust
measure of relatedness between strains, with potential for confirming phylogenetic and taxonomic
relationships in bacterial identification (Konstantinidis and Tiedje, 2005). An ANIb similarity
threshold above 95%, with respect to species reference type strains, is proposed to provide species-
level identifications of given genomes (Kim et al., 2014; Richter and Rosselló-Móra, 2009). In our
study, the genome sequences of S. pneumoniae and S. pseudopneumoniae fulfilled the suggested
threshold, whereas only four of the S. mitis genome sequences fulfilled this criterion. However,
cluster analyses, derived from the all vs. all ANIb similarity values of the genome sequences,
allowed discrimination of the different species, by clustering with respect to species type strains.
The in silico DNA-DNA hybridization calculated with the GGDC was also inconclusive for
identification of S. mitis genome sequences.
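The 95% ANIb criterion can be expressed as a simple rule. The sketch below is an illustration under stated assumptions, not the pipeline used in this study: it assumes that the ANIb similarity values in Tables 1 and 2 are relative to the respective species type strains, the function name is hypothetical, and "above 95%" is interpreted as a strict inequality.

```python
from typing import Dict, Optional

ANIB_THRESHOLD = 95.0  # percent; species-level threshold cited in the text


def species_by_anib(
    anib_to_type_strains: Dict[str, float],
    threshold: float = ANIB_THRESHOLD,
) -> Optional[str]:
    """Return the best-matching species if its ANIb exceeds the threshold."""
    species, value = max(anib_to_type_strains.items(), key=lambda item: item[1])
    # Below or at the threshold, no species-level identification is made.
    return species if value > threshold else None


# ANIb similarity values copied from Table 1 (strain 1321).
strain_1321 = {
    "S. pseudopneumoniae": 98.4,
    "S. pneumoniae": 94.2,
    "S. mitis": 91.8,
}
# ANIb similarity values copied from Table 2 (strain SK642).
strain_sk642 = {
    "S. mitis": 92.5,
    "S. pseudopneumoniae": 91.3,
    "S. pneumoniae": 90.9,
}

print(species_by_anib(strain_1321))
print(species_by_anib(strain_sk642))
```

For strain SK642, no species reaches the threshold, which mirrors the observation that most S. mitis genome sequences could not be identified by the threshold alone and required cluster analysis.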
The recently described “Xisco” gene, which is detected by a single PCR, seems to be a good marker
for the correct identification of S. pneumoniae and differentiation from the closely related species
S. pseudopneumoniae and S. mitis (Salvà-Serra et al., 2017). Both in the in silico and in vitro
analyses, the “Xisco” gene was present in all S. pneumoniae strains and absent in all genomes and
strains of the non-pneumococcus Mitis group species. Other targets have been proposed to be
specific for pneumococci during the last decades, such as pneumolysin (ply) (McAvin et al., 2001),
autolysin (lytA) (Corless et al., 2001), pneumococcal surface antigen A (psaA) (Morrison et al.,
2000), and penicillin binding protein (pbp) (O'Neill et al., 1999), among others. However, the
“Xisco” gene seems to be more robust and distinguishes S. pneumoniae from the other species of
the Mitis group more reliably. Since recombination in the Mitis group may occur, it is potentially
unreliable to use a single gene biomarker for identification of S. pneumoniae.
Identification of S. pneumoniae serogroup/serotype is important for surveillance of strains in
disease carriage and for strategies in vaccine development (O'Brien et al., 2009). For serotyping S.
pneumoniae, the Quellung reaction method is considered the ‘gold standard’, although it can be
performed only on viable isolates, needs expertise and is expensive. Recently, molecular
techniques, such as genotypic typing methods targeting serotype-specific regions of the cps genes,
including multiplex PCR (Brito et al., 2003; da Gloria Carvalho et al., 2010; Jourdain et al., 2011;
Pai et al., 2006; Richter et al., 2013) and multiplex real-time Q-PCR (Pimenta et al., 2013), have
been described. These methods allow the detection of multiple serotypes but are still relatively
laborious, considering that nearly 100 different serotypes are known today. Most of these
methods were designed to be able to identify the serotypes that have been included in vaccines or
which are most common in given geographic areas. However, in surveillance studies, replacement
of vaccine serotypes by non-vaccine serotypes has been reported in regions where pneumococcal
conjugate vaccines are implemented (Hicks et al., 2007; Weinberger et al., 2011), raising the
necessity for simplified methods that allow detection of as many serotypes as possible, as well as
recognition of newly-evolved serotypes.
The recently described sequetyping technique by Leung et al. (2012) has the advantage of being
able to detect a broad range of serotypes in one analysis. However, in our hands it was difficult to
obtain adequate amplicons and, as pointed out by Leung et al., the size of the amplicon (1,061 bp)
is too large for the current Sanger sequencing protocols. Therefore, we added two internal primers
to amplify and sequence two fragments, which made it possible to obtain the whole cpsB sequence
with good quality. This strategy allows distinction between the serotypes 18B and 18C, but not
differentiation within the serogroups 6 (6A, 6B, 6C, and 6D) and 7 (7F and 7A). An advantage of
sequetyping is that it can be based on data from the publicly available GenBank database, although
the way in which data deposited in this database are handled implies a potential risk of incorrect
assignment of serotype designations, as well as incorrect taxonomic classifications of strains. We confirmed that
cpsB sequetyping gave correct serotype results by performing an in silico sequetyping of 329
available S. pneumoniae genomes and also a BLAST similarity analysis to further document the
accuracy of the sequetyping method.
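The principle of assigning a serotype from a cpsB sequence by similarity to reference sequences can be sketched as follows. This is a simplified illustration and not the BLAST-based analysis used in this study: the short reference sequences are invented toy data (not real cpsB sequences), the function names are hypothetical, and identity is computed position by position on sequences assumed to be already aligned and of equal length.

```python
from typing import Dict, List, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def sequetype(query: str, references: Dict[str, str]) -> Tuple[List[str], float]:
    """Return all serotypes sharing the highest identity with the query."""
    scores = {
        serotype: percent_identity(query, ref)
        for serotype, ref in references.items()
    }
    best = max(scores.values())
    # Several serotypes can tie when their reference sequences are identical.
    hits = sorted(s for s, score in scores.items() if score == best)
    return hits, best


# Invented toy references for illustration only (NOT real cpsB sequences).
TOY_REFERENCES = {
    "18B": "ACGTACGTAC",
    "18C": "ACGTACGTAT",
    "6A": "TTGCAAGGCC",
    "6B": "TTGCAAGGCC",
}

print(sequetype("ACGTACGTAT", TOY_REFERENCES))
print(sequetype("TTGCAAGGCC", TOY_REFERENCES))
```

Reporting every tied serotype, rather than a single best hit, makes explicit the cases in which the target region cannot resolve serotypes within a serogroup.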
The utility of DNA-based methods for serotyping can be limited, due to inherent difficulties of
differentiations within serogroups, which is of importance because available vaccines may include
some, but not all, serotypes of a serogroup. Recently, PCR-based protocols for improved
discrimination of serogroup 18 and serotypes 22F and 33F were described (Gillis et al., 2017;
Tanmoy et al., 2016). Here we present a modified protocol for discriminating serogroup 6
serotypes, based on sequence analysis. The distinction is important, given the significant increase
of pneumococcal infections of serotype 6C after introduction of conjugate vaccines.
The sequetyping was applied to 50 pneumococcal clinical isolates, which were also analysed by
real-time Q-PCR and serotyped using an antiserum panel at the Public Health Agency of Sweden.
The comparisons showed good agreement between the assays, similar to what was observed by
Dube et al. (2015). The results confirmed that sequetyping was also able to detect
several non-vaccine serotypes. These serotypes were not detected by real-time Q-PCR
because this assay identifies only those serotypes that are specifically targeted. However, the use
of the sequetyping method is limited to single isolates, because different serotypes are difficult
to differentiate when analysing the sequence chromatograms. In contrast, the real-time Q-PCR can be
used with total DNA extracts from samples and is, therefore, able to recognize the presence of
multiple serotypes in a given sample.
In conclusion, the presence of the “Xisco” gene, genome sequence ANIb, in silico DNA-DNA
hybridization and targeted groEL comparative sequence analyses are reliable methods for
identification of pneumococci. Serotyping by using PCR- and DNA sequence-based methods is
highly useful in cases where access to traditional methods is limited and when cultivation of
isolates is negative. However, since S. pneumoniae and the related species of the Mitis group of
Streptococcus undergo constant recombination, different techniques need to be
applied in order to verify the reliability of analyses.
ACKNOWLEDGMENTS
This work was supported by the European Commission: TAILORED-Treatment (project number
602860; www.tailored-treatment.eu). The Culture Collection University of Gothenburg (CCUG)
is supported by the Department of Clinical Microbiology, Sahlgrenska University Hospital and the
Sahlgrenska Academy of the University of Gothenburg. FS-S was supported by stipends for Basic
and Advanced Research from the CCUG, through the Institute of Biomedicine, Sahlgrenska
Academy, University of Gothenburg.
Declarations of interest: none.
REFERENCES
Benson, D.A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Ostell, J., Pruitt, K.D., Sayers, E.W.,
2017. GenBank. Nucleic acids research.
Bentley, S.D., Aanensen, D.M., Mavroidi, A., Saunders, D., Rabbinowitsch, E., Collins, M.,
Donohoe, K., Harris, D., Murphy, L., Quail, M.A., Samuel, G., Skovsted, I.C., Kaltoft, M.S.,
Barrell, B., Reeves, P.R., Parkhill, J., Spratt, B.G., 2006. Genetic analysis of the capsular
biosynthetic locus from all 90 pneumococcal serotypes. PLoS genetics 2, e31.
Bishop, C.J., Aanensen, D.M., Jordan, G.E., Kilian, M., Hanage, W.P., Spratt, B.G., 2009.
Assigning strains to bacterial species via the internet. BMC biology 7, 3.
Brito, D.A., Ramirez, M., de Lencastre, H., 2003. Serotyping Streptococcus pneumoniae by
multiplex PCR. Journal of clinical microbiology 41, 2378-2384.
Caraux, G., Pinloche, S., 2005. PermutMatrix: a graphical environment to arrange gene expression
profiles in optimal linear order. Bioinformatics 21, 1280-1281.
Chi, F., Nolte, O., Bergmann, C., Ip, M., Hakenbeck, R., 2007. Crossing the barrier: evolution and
spread of a major class of mosaic pbp2x in Streptococcus pneumoniae, S. mitis and S. oralis.
International journal of medical microbiology : IJMM 297, 503-512.
Collaborators, G.B.D.D.D., 2017. Estimates of global, regional, and national morbidity, mortality,
and aetiologies of diarrhoeal diseases: a systematic analysis for the Global Burden of Disease Study
2015. Lancet Infect Dis 17, 909-948.
Corless, C.E., Guiver, M., Borrow, R., Edwards-Jones, V., Fox, A.J., Kaczmarski, E.B., 2001.
Simultaneous detection of Neisseria meningitidis, Haemophilus influenzae, and Streptococcus
pneumoniae in suspected cases of meningitis and septicemia using real-time PCR. Journal of
clinical microbiology 39, 1553-1558.
da Gloria Carvalho, M., Pimenta, F.C., Jackson, D., Roundtree, A., Ahmad, Y., Millar, E.V.,
O'Brien, K.L., Whitney, C.G., Cohen, A.L., Beall, B.W., 2010. Revisiting pneumococcal carriage
by use of broth enrichment and PCR techniques for enhanced detection of carriage and serotypes.
Journal of clinical microbiology 48, 1611-1618.
Diao, W.Q., Shen, N., Yu, P.X., Liu, B.B., He, B., 2016. Efficacy of 23-valent pneumococcal
polysaccharide vaccine in preventing community-acquired pneumonia among immunocompetent
adults: A systematic review and meta-analysis of randomized trials. Vaccine 34, 1496-1503.
Dube, F.S., van Mens, S.P., Robberts, L., Wolter, N., Nicol, P., Mafofo, J., Africa, S., Zar, H.J.,
Nicol, M.P., 2015. Comparison of a Real-Time Multiplex PCR and Sequetyping Assay for
Pneumococcal Serotyping. PloS one 10, e0137349.
Esposito, S., Principi, N., 2015. Impacts of the 13-Valent Pneumococcal Conjugate Vaccine in
Children. J Immunol Res 2015, 591580.
Fry, R.M., Greaves, R.I., 1951. The survival of bacteria during and after drying. J Hyg (Lond) 49,
220-246.
Geno, K.A., Gilbert, G.L., Song, J.Y., Skovsted, I.C., Klugman, K.P., Jones, C., Konradsen, H.B.,
Nahm, M.H., 2015. Pneumococcal Capsules and Their Types: Past, Present, and Future. Clin
Microbiol Rev 28, 871-899.
Gillis, H.D., Demczuk, W.H.B., Griffith, A., Martin, I., Warhuus, M., Lang, A.L.S., ElSherif, M.,
McNeil, S.A., LeBlanc, J.J., 2017. PCR-based discrimination of emerging Streptococcus
pneumoniae serotypes 22F and 33F. J Microbiol Methods 144, 99-106.
Glazunova, O.O., Raoult, D., Roux, V., 2009. Partial sequence comparison of the rpoB, sodA,
groEL and gyrB genes within the genus Streptococcus. International journal of systematic and
evolutionary microbiology 59, 2317-2322.
Gomila, M., Pena, A., Mulet, M., Lalucat, J., Garcia-Valdes, E., 2015. Phylogenomics and
systematics in Pseudomonas. Front Microbiol 6, 214.
Goris, J., Konstantinidis, K.T., Klappenbach, J.A., Coenye, T., Vandamme, P., Tiedje, J.M., 2007.
DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.
International journal of systematic and evolutionary microbiology 57, 81-91.
Hathaway, L.J., Stutzmann Meier, P., Battig, P., Aebi, S., Muhlemann, K., 2004. A homologue of
aliB is found in the capsule region of nonencapsulated Streptococcus pneumoniae. Journal of
bacteriology 186, 3721-3729.
Hicks, L.A., Harrison, L.H., Flannery, B., Hadler, J.L., Schaffner, W., Craig, A.S., Jackson, D.,
Thomas, A., Beall, B., Lynfield, R., Reingold, A., Farley, M.M., Whitney, C.G., 2007. Incidence
of pneumococcal disease due to non-pneumococcal conjugate vaccine (PCV7) serotypes in the
United States during the era of widespread PCV7 vaccination, 1998-2004. The Journal of infectious
diseases 196, 1346-1354.
Hoshino, T., Fujiwara, T., Kilian, M., 2005. Use of phylogenetic and phenotypic analyses to
identify nonhemolytic streptococci isolated from bacteremic patients. Journal of clinical
microbiology 43, 6073-6085.
Jauneikaite, E., Tocheva, A.S., Jefferies, J.M., Gladstone, R.A., Faust, S.N., Christodoulides, M.,
Hibberd, M.L., Clarke, S.C., 2015. Current methods for capsular typing of Streptococcus
pneumoniae. J Microbiol Methods 113, 41-49.
Jensen, A., Scholz, C.F., Kilian, M., 2016. Re-evaluation of the taxonomy of the Mitis group of the
genus Streptococcus based on whole genome phylogenetic analyses, and proposed reclassification
of Streptococcus dentisani as Streptococcus oralis subsp. dentisani comb. nov., Streptococcus
tigurinus as Streptococcus oralis subsp. tigurinus comb. nov., and Streptococcus oligofermentans
as a later synonym of Streptococcus cristatus. International journal of systematic and evolutionary
microbiology 66, 4803-4820.
Jin, P., Xiao, M., Kong, F., Oftadeh, S., Zhou, F., Liu, C., Gilbert, G.L., 2009. Simple, accurate,
serotype-specific PCR assay to differentiate Streptococcus pneumoniae serotypes 6A, 6B, and 6C.
Journal of clinical microbiology 47, 2470-2474.
Johnson, H.L., Deloria-Knoll, M., Levine, O.S., Stoszek, S.K., Freimanis Hance, L., Reithinger,
R., Muenz, L.R., O'Brien, K.L., 2010. Systematic evaluation of serotypes causing invasive
pneumococcal disease among children under five: the pneumococcal global serotype project. PLoS
Med 7.
Jourdain, S., Dreze, P.A., Vandeven, J., Verhaegen, J., Van Melderen, L., Smeesters, P.R., 2011.
Sequential multiplex PCR assay for determining capsular serotypes of colonizing S. pneumoniae.
BMC Infect Dis 11, 100.
Kawamura, Y., Hou, X.G., Sultana, F., Miura, H., Ezaki, T., 1995. Determination of 16S rRNA
sequences of Streptococcus mitis and Streptococcus gordonii and phylogenetic relationships
among members of the genus Streptococcus. International journal of systematic bacteriology 45,
406-408.
Kawamura, Y., Whiley, R.A., Shu, S.E., Ezaki, T., Hardie, J.M., 1999. Genetic approaches to the
identification of the mitis group within the genus Streptococcus. Microbiology 145 ( Pt 9), 2605-
2613.
Keith, E.R., Podmore, R.G., Anderson, T.P., Murdoch, D.R., 2006. Characteristics of
Streptococcus pseudopneumoniae isolated from purulent sputum samples. Journal of clinical
microbiology 44, 923-927.
Kilian, M., Poulsen, K., Blomqvist, T., Havarstein, L.S., Bek-Thomsen, M., Tettelin, H., Sorensen,
U.B., 2008. Evolution of Streptococcus pneumoniae and its close commensal relatives. PloS one
3, e2683.
Kilian, M., Riley, D.R., Jensen, A., Bruggemann, H., Tettelin, H., 2014. Parallel evolution of
Streptococcus pneumoniae and Streptococcus mitis to pathogenic and mutualistic lifestyles. MBio
5, e01490-01414.
Kim, M., Oh, H.S., Park, S.C., Chun, J., 2014. Towards a taxonomic coherence between average
nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes.
International journal of systematic and evolutionary microbiology 64, 346-351.
Konstantinidis, K.T., Tiedje, J.M., 2005. Genomic insights that advance the species definition for
prokaryotes. Proceedings of the National Academy of Sciences of the United States of America
102, 2567-2572.
LeBlanc, J.J., ElSherif, M., Ye, L., MacKinnon-Cameron, D., Li, L., Ambrose, A., Hatchette, T.F.,
Lang, A.L., Gillis, H., Martin, I., Andrew, M.K., Boivin, G., Bowie, W., Green, K., Johnstone, J.,
Loeb, M., McCarthy, A., McGeer, A., Moraca, S., Semret, M., Stiver, G., Trottier, S., Valiquette,
L., Webster, D., McNeil, S.A., Serious Outcomes Surveillance Network of the Canadian
Immunization Research, N., 2017. Burden of vaccine-preventable pneumococcal disease in
hospitalized adults: A Canadian Immunization Research Network (CIRN) Serious Outcomes
Surveillance (SOS) network study. Vaccine 35, 3647-3654.
Leung, M.H., Bryson, K., Freystatter, K., Pichon, B., Edwards, G., Charalambous, B.M., Gillespie,
S.H., 2012. Sequetyping: serotyping Streptococcus pneumoniae by a single PCR sequencing
strategy. Journal of clinical microbiology 50, 2419-2427.
Loman, N.J., Gladstone, R.A., Constantinidou, C., Tocheva, A.S., Jefferies, J.M., Faust, S.N.,
O'Connor, L., Chan, J., Pallen, M.J., Clarke, S.C., 2013. Clonal expansion within pneumococcal
serotype 6C after use of seven-valent vaccine. PloS one 8, e64731.
Mavroidi, A., Godoy, D., Aanensen, D.M., Robinson, D.A., Hollingshead, S.K., Spratt, B.G., 2004.
Evolutionary genetics of the capsular locus of serogroup 6 pneumococci. Journal of bacteriology
186, 8181-8192.
McAvin, J.C., Reilly, P.A., Roudabush, R.M., Barnes, W.J., Salmen, A., Jackson, G.W., Beninga,
K.K., Astorga, A., McCleskey, F.K., Huff, W.B., Niemeyer, D., Lohman, K.L., 2001. Sensitive
and specific method for rapid identification of Streptococcus pneumoniae using real-time
fluorescence PCR. Journal of clinical microbiology 39, 3446-3451.
Meier-Kolthoff, J.P., Auch, A.F., Klenk, H.P., Goker, M., 2013. Genome sequence-based species
delimitation with confidence intervals and improved distance functions. BMC bioinformatics 14,
60.
Morrison, K.E., Lake, D., Crook, J., Carlone, G.M., Ades, E., Facklam, R., Sampson, J.S., 2000.
Confirmation of psaA in all 90 serotypes of Streptococcus pneumoniae by PCR and potential of
this assay for identification and diagnosis. Journal of clinical microbiology 38, 434-437.
Nelson, A.L., Roche, A.M., Gould, J.M., Chim, K., Ratner, A.J., Weiser, J.N., 2007. Capsule
enhances pneumococcal colonization by limiting mucus-mediated clearance. Infection and
immunity 75, 83-90.
Neufeld F, H.L., 1910. Weitere untersuchungen uber pneumokokken-heilsera. III. Mitteilung
Arbeiten aus dem Kaiserlichen Gesundheitsamte 34, 293–304.
O'Brien, K.L., Wolfson, L.J., Watt, J.P., Henkle, E., Deloria-Knoll, M., McCall, N., Lee, E.,
Mulholland, K., Levine, O.S., Cherian, T., Hib, Pneumococcal Global Burden of Disease Study,
T., 2009. Burden of disease caused by Streptococcus pneumoniae in children younger than 5 years:
global estimates. Lancet 374, 893-902.
O'Neill, A.M., Gillespie, S.H., Whiting, G.C., 1999. Detection of penicillin susceptibility in
Streptococcus pneumoniae by pbp2b PCR-restriction fragment length polymorphism analysis.
Journal of clinical microbiology 37, 157-160.
Pai, R., Gertz, R.E., Beall, B., 2006. Sequential multiplex PCR approach for determining capsular
serotypes of Streptococcus pneumoniae isolates. Journal of clinical microbiology 44, 124-131.
Park, I.H., Moore, M.R., Treanor, J.J., Pelton, S.I., Pilishvili, T., Beall, B., Shelly, M.A., Mahon,
B.E., Nahm, M.H., Active Bacterial Core Surveillance, T., 2008. Differential effects of
pneumococcal vaccines against serotypes 6A and 6C. The Journal of infectious diseases 198, 1818-
1822.
Pimenta, F.C., Roundtree, A., Soysal, A., Bakir, M., du Plessis, M., Wolter, N., von Gottberg, A.,
McGee, L., Carvalho Mda, G., Beall, B., 2013. Sequential triplex real-time PCR assay for detecting
21 pneumococcal capsular serotypes that account for a high global disease burden. Journal of
clinical microbiology 51, 647-652.
Richter, M., Rosselló-Móra, R., 2009. Shifting the genomic gold standard for the prokaryotic
species definition. Proceedings of the National Academy of Sciences of the United States of
America 106, 19126-19131.
Richter, M., Rosselló-Móra, R., Oliver Glockner, F., Peplies, J., 2016. JSpeciesWS: a web server
for prokaryotic species circumscription based on pairwise genome comparison. Bioinformatics 32,
929-931.
Richter, S.S., Heilmann, K.P., Dohrn, C.L., Riahi, F., Beekmann, S.E., Doern, G.V., 2008.
Accuracy of phenotypic methods for identification of Streptococcus pneumoniae isolates included
in surveillance programs. Journal of clinical microbiology 46, 2184-2188.
Richter, S.S., Heilmann, K.P., Dohrn, C.L., Riahi, F., Diekema, D.J., Doern, G.V., 2013.
Evaluation of pneumococcal serotyping by multiplex PCR and quellung reactions. Journal of
clinical microbiology 51, 4193-4195.
Rolo, D., A, S.S., Domenech, A., Fenoll, A., Linares, J., de Lencastre, H., Ardanuy, C., Sa-Leao,
R., 2013. Disease isolates of Streptococcus pseudopneumoniae and non-typeable S. pneumoniae
presumptively identified as atypical S. pneumoniae in Spain. PloS one 8, e57047.
Rosselló-Móra, R., Amann, R., 2015. Past and future species definitions for Bacteria and Archaea.
Systematic and applied microbiology 38, 209-216.
Salvà-Serra, F., Connolly, G., Moore, E.R.B., Gonzales-Siles, L., 2017. Detection of "Xisco" gene
for identification of Streptococcus pneumoniae isolates. Diagnostic microbiology and infectious
disease.
Scholz, C.F., Poulsen, K., Kilian, M., 2012. Novel molecular method for identification of
Streptococcus pneumoniae applicable to clinical microbiology and 16S rRNA sequence-based
microbiome studies. Journal of clinical microbiology 50, 1968-1973.
Slotved, H.C., Dalby, T., Hoffmann, S., 2016. The effect of pneumococcal conjugate vaccines on
the incidence of invasive pneumococcal disease caused by ten non-vaccine serotypes in Denmark.
Vaccine 34, 769-774.
Tanmoy, A.M., Saha, S., Darmstadt, G.L., Whitney, C.G., Saha, S.K., 2016. PCR-Based
Serotyping of Streptococcus pneumoniae from Culture-Negative Specimens: Novel Primers for
Detection of Serotypes within Serogroup 18. Journal of clinical microbiology 54, 2178-2181.
Teng, L.J., Hsueh, P.R., Tsai, J.C., Chen, P.W., Hsu, J.C., Lai, H.C., Lee, C.N., Ho, S.W., 2002.
groESL sequence determination, phylogenetic analysis, and species differentiation for viridans
group streptococci. Journal of clinical microbiology 40, 3172-3178.
Varghese, R., Jayaraman, R., Veeraraghavan, B., 2017. Current challenges in the accurate
identification of Streptococcus pneumoniae and its serogroups/serotypes in the vaccine era. J
Microbiol Methods 141, 48-54.
Weinberger, D.M., Malley, R., Lipsitch, M., 2011. Serotype replacement in disease after
pneumococcal vaccination. Lancet 378, 1962-1973.
Welinder-Olsson, C., Kjellin, E., Badenfors, M., Kaijser, B., 2000. Improved microbiological
techniques using the polymerase chain reaction and pulsed-field gel electrophoresis for diagnosis
and follow-up of enterohaemorrhagic Escherichia coli infection. European journal of clinical
microbiology & infectious diseases : official publication of the European Society of Clinical
Microbiology 19, 843-851.
Whatmore, A.M., Efstratiou, A., Pickerill, A.P., Broughton, K., Woodard, G., Sturgeon, D.,
George, R., Dowson, C.G., 2000. Genetic relationships between clinical isolates of Streptococcus
pneumoniae, Streptococcus oralis, and Streptococcus mitis: characterization of "Atypical"
pneumococci and organisms allied to S. mitis harboring S. pneumoniae virulence factor-encoding
genes. Infection and immunity 68, 1374-1382.
Table 1. Taxonomic identification of the correctly-classified Streptococcus pseudopneumoniae strains from GenBank.
GenBank strain | ANIb all vs. all | groEL 1st match | % | groEL 2nd match | % | groEL 3rd match | % | ANIb similarity to S. pseudopneumoniae | ANIb similarity to S. pneumoniae | ANIb similarity to S. mitis | GGDC (%)
1321 | S. pseudopneumoniae | S. pseudopneumoniae | 100 | S. mitis | 95.8 | S. pneumoniae | 94.1 | 98.4 | 94.2 | 91.8 | 86.6
22725 | S. pseudopneumoniae | S. pneumoniae | 99.6 | S. pseudopneumoniae | 93.9 | S. mitis | 93.4 | 96.9 | 94.1 | 91.9 | 74.9
276-03 | S. pseudopneumoniae | S. pneumoniae | 99.5 | S. pseudopneumoniae | 93.8 | S. mitis | 93.3 | 97 | 94.1 | 91.9 | 75.6
338-14 | S. pseudopneumoniae | S. pneumoniae | 99.6 | S. pseudopneumoniae | 93.9 | S. mitis | 93.4 | 97 | 94.1 | 91.8 | 75.5
5247 | S. pseudopneumoniae | S. pseudopneumoniae | 96.4 | S. mitis | 95.7 | S. pneumoniae | 95 | 96.5 | 94.1 | 91.9 | 72.3
61-14 | S. pseudopneumoniae | S. pneumoniae | 99.6 | S. pseudopneumoniae | 93.9 | S. mitis | 93.4 | 96.8 | 94 | 91.6 | 75.5
G42 | S. pseudopneumoniae | S. pneumoniae | 99.1 | S. pseudopneumoniae | 94.5 | S. mitis | 93.9 | 96.9 | 94 | 91.8 | 75.3
IS7493 | S. pseudopneumoniae | S. pneumoniae | 99.6 | S. pseudopneumoniae | 93.9 | S. mitis | 93.4 | 96.8 | 94 | 91.9 | 73.8
SK674 | S. pseudopneumoniae | S. pseudopneumoniae | 100 | S. mitis | 95.8 | S. pneumoniae | 94.1 | 98.8 | 94.1 | 91.9 | 89.9
The results show the 9 of the 39 genome sequences classified as S. pseudopneumoniae at the GenBank database that were confirmed to their taxonomical identity as S. pseudopneumoniae by ANIb and GGDC analyses.
Table 2. Taxonomic identifications of the correctly-classified Streptococcus mitis strains from GenBank.
| NCBI strain | ANIb all vs all | groEL 1st match | % | groEL 2nd match | % | groEL 3rd match | % | ANIb similarity: S. mitis | ANIb similarity: S. pseudopneumoniae | ANIb similarity: S. pneumoniae | GGDC (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10712 | S. mitis | S. mitis | 100 | S. pseudopneumoniae | 95.8 | S. oralis | 94.2 | 95.0 | 92.1 | 91.4 | 62.3 |
| 11/5 | S. mitis | S. mitis | 98.6 | S. pseudopneumoniae | 95.9 | S. pneumoniae | 93.5 | 93.2 | 91.6 | 90.5 | 62.7 |
| 21/39 | S. mitis | S. mitis | 97.4 | S. pseudopneumoniae | 95.8 | S. pneumoniae | 94.1 | 93.1 | 92.3 | 91.6 | 53.4 |
| 27/7 | S. mitis | S. mitis | 96.7 | S. pseudopneumoniae | 95.1 | S. pneumoniae | 93.5 | 95.5 | 91.9 | 91.3 | 64.1 |
| 29/42 | S. mitis | S. mitis | 99.1 | S. pseudopneumoniae | 95.8 | S. oralis | 94.2 | 95.2 | 91.9 | 91.2 | 63.0 |
| OT25 | S. mitis | S. mitis | 97 | S. pseudopneumoniae | 95.9 | S. oralis | 94.3 | 93.8 | 92.1 | 91.5 | 56.0 |
| SK1080 | S. mitis | S. mitis | 96.8 | S. pseudopneumoniae | 95.9 | S. pneumoniae | 94.5 | 92.9 | 93.2 | 92.7 | 51.8 |
| SK271 | S. mitis | S. mitis | 99.3 | S. pseudopneumoniae | 95.8 | S. oralis | 93.8 | 95.6 | 92.3 | 91.5 | 64.9 |
| SK321 | S. mitis | S. mitis | 96 | S. pseudopneumoniae | 94.5 | S. pneumoniae | 94.2 | 93.6 | 92.2 | 91.4 | 55.8 |
| SK578 | S. mitis | S. mitis | 96 | S. pseudopneumoniae | 95.4 | S. pneumoniae | 94.1 | 93.1 | 91.5 | 90.8 | 54.1 |
| SK642 | S. mitis | S. mitis | 96.6 | S. pseudopneumoniae | 95.7 | S. pneumoniae | 94.1 | 92.5 | 91.3 | 90.9 | 50.0 |
| SK579 | S. mitis | S. pneumoniae | 96 | S. mitis | 95 | S. pseudopneumoniae | 94.9 | 92.6 | 91.5 | 90.9 | 50.9 |
| 1111_SMIT | S. mitis | S. mitis | 94.2 | S. pseudopneumoniae | 93.8 | S. oralis | 93.5 | 95.1 | 92.1 | 91.2 | 54.1 |
| 13/39 | S. mitis | S. mitis | 95.8 | S. pseudopneumoniae | 95.4 | S. pneumoniae | 93.8 | 93.4 | 91.6 | 90.8 | 54.4 |
| 17/34 | S. mitis | S. mitis | 94.6 | S. oralis | 94 | S. pseudopneumoniae | 93.7 | 92.2 | 91.3 | 90.8 | 48.7 |
| 18/56 | S. mitis | S. mitis | 94.7 | S. pseudopneumoniae | 94.7 | S. pneumoniae | 93.7 | 93.1 | 91.5 | 91.1 | 53.0 |
| 850_SMIT | S. mitis | S. pneumoniae | 95.8 | S. mitis | 95 | S. pseudopneumoniae | 94.7 | 93.1 | 92.4 | 91.8 | 53.9 |
| B6 | S. mitis | S. mitis | 94.6 | S. pseudopneumoniae | 93.5 | S. pneumoniae | 93.5 | 93.3 | 91.9 | 91.5 | 53.5 |
| CMW7705B | S. mitis | S. mitis | 94.5 | S. pseudopneumoniae | 94.1 | S. oralis | 93.7 | 93.2 | 92.3 | 91.7 | 53.6 |
| DD22 | S. mitis | S. mitis | 92.9 | S. pseudopneumoniae | 92.6 | S. pneumoniae | 91.7 | 93.2 | 91.4 | 90.8 | 53.7 |
| DD26 | S. mitis | S. pseudopneumoniae | 95.1 | S. pneumoniae | 94.7 | S. mitis | 94.5 | 92.4 | 93.1 | 92.8 | 49.7 |
| DD28 | S. mitis | S. pneumoniae | 95.3 | S. mitis | 93.7 | S. pseudopneumoniae | 93.7 | 92.6 | 92.7 | 92.1 | 50.6 |
| KCOM 1350 | S. mitis | S. mitis | 95 | S. pneumoniae | 94.7 | S. pseudopneumoniae | 94.6 | 94.0 | 91.8 | 91.3 | 56.9 |
| SK1073 | S. mitis | S. mitis | 95.7 | S. pseudopneumoniae | 94.1 | S. pneumoniae | 93.4 | 94.0 | 91.8 | 91.3 | 56.9 |
| SK1126 | S. mitis | S. mitis | 94.5 | S. pseudopneumoniae | 93.4 | S. pneumoniae | 93.1 | 93.0 | 91.7 | 91.1 | 52.2 |
| SK145 | S. mitis | S. mitis | 95.4 | S. pseudopneumoniae | 95.1 | S. pneumoniae | 93.8 | 94.6 | 92.1 | 91.3 | 60.0 |
| SK564 | S. mitis | S. mitis | 95.9 | S. pseudopneumoniae | 95.1 | S. pneumoniae | 94.2 | 92.7 | 92.3 | 91.6 | 52.1 |
| SK569 | S. mitis | S. pneumoniae | 94.7 | S. mitis | 94.2 | S. pseudopneumoniae | 93.5 | 92.6 | 91.7 | 91.1 | 50.9 |
| SK575 | S. mitis | S. mitis | 95 | S. pseudopneumoniae | 95 | S. pneumoniae | 94.2 | 93.2 | 92.4 | 91.6 | 53.3 |
| SK597 | S. mitis | S. pseudopneumoniae | 95.9 | S. mitis | 94.9 | S. pneumoniae | 93.5 | 93.0 | 92.6 | 91.9 | 52.5 |
| SK608 | S. mitis | S. mitis | 94.9 | S. pneumoniae | 94.3 | S. oralis | 93.7 | 93.1 | 92.1 | 91.5 | 53.7 |
| SK616 | S. mitis | S. pneumoniae | 94.7 | S. mitis | 94.2 | S. pseudopneumoniae | 93.5 | 92.4 | 91.6 | 91.1 | 50.4 |
| SK629 | S. mitis | S. mitis | 94.6 | S. pseudopneumoniae | 94.5 | S. pneumoniae | 93.3 | 92.3 | 91.5 | 91.2 | 50.2 |
| SK637 | S. mitis | S. mitis | 94.5 | S. oralis | 94.2 | S. pseudopneumoniae | 93.7 | 93.7 | 92.0 | 91.2 | 55.0 |
| SK667 | S. mitis | S. mitis | 95.4 | S. pseudopneumoniae | 94.5 | S. pneumoniae | 93.8 | 92.3 | 92.0 | 91.2 | 50.5 |
| SVGS_061 | S. mitis | S. pneumoniae | 93.5 | S. mitis | 93 | S. oralis | 93 | 92.0 | 91.5 | 90.7 | 49.4 |
| M3-1 | S. mitis | a | | | | | | 93.2 | 90.7 | 91.3 | 54.3 |
| M3-4 | S. mitis | a | | | | | | 93.2 | 90.7 | 91.4 | 54.3 |
| SK137 | S. mitis | a | | | | | | 94.8 | 91.2 | 92.0 | 60.5 |
| SK137 | S. mitis | a | | | | | | 94.8 | 91.2 | 92.1 | 60.4 |

The table shows the 36 of the 52 genome sequences classified as S. mitis in the GenBank database whose taxonomic identity as S. mitis was confirmed by ANIb and GGDC analyses.
a groEL gene was not present in the genome sequence.
Table 3. Serotypes identified by sequetyping, using in silico analysis of the 329 genome sequences of S.
pneumoniae from GenBank included in our local database.
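| Serotype | # strains |
|---|---|
| 1 | 33 |
| 2/41A | 1 |
| 3 | 14 |
| 4 | 11 |
| 5 | 1 |
| 6 | 32 |
| 7A/7F | 7 |
| 8 | 2 |
| 9A/9V | 11 |
| 9N/9L | 2 |
| 10A | 3 |
| 10C | 1 |
| 10C/10F (a) | 8 |
| 11a/11D/18F | 6 |
| 12a/46 | 3 |
| 13/20 | 2 |
| 14 | 17 |
| 21 | 1 |
| 15A/15F | 4 |
| 16F | 1 |
| 17F/33C | 4 |
| 18B/18C | 3 |
| 19A | 48 |
| 19B | 1 |
| 19C | 2 |
| 19F | 28 |
| 22A/22F | 5 |
| 23A | 2 |
| 23F | 16 |
| 24F (a) | 7 |
| 33A/35A/33F | 2 |
| 35F/47F | 1 |
| NT | 36 |
| NI | 14 |

a match with low similarity value
NT, non-typable
NI, non-identified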
Table 4. Serotype identification of strains isolated from clinical samples.
| Antiserum panel (a) | Real-time PCR | Sequetyping | # positives |
|---|---|---|---|
| 3 | 3 | 3 | 5 |
| 4 | 4 | 4 | 1 |
| 6A | 6A/6B/6C/6D | 6A/6B/6C/6D | 1 |
| 6B | 6A/6B/6C/6D | 6A/6B/6C/6D | 1 |
| 7F | 7F/7A | 7F/7A | 5 |
| 9N | 9N/9L | 9N/9L | 2 |
| 9V | 9A/9V | 9A/9V | 1 |
| 8 | 8 | 8 | 2 |
| 31 | NA | 31 | 1 |
| 11A | 11A/11D | 11A/11D/18F | 2 |
| 11D | 11A/11D | 11A/11D/18F | 1 |
| 12F | 12F/12A/44/46 | 12F | 1 |
| 15A | NA | 15A | 2 |
| 15B | 15B/15C | 15B/15C | 1 |
| 15C | 15B/15C | 15B/15C | 1 |
| 18A | 18 | 18A | 1 |
| 18C | 18 | 18B/18C | 3 |
| 19A | 19A | 19A | 1 |
| 19F | 19F | 19F/19A | 1 |
| 20 | 20 | 20/13 | 2 |
| 22F | 22F/22A | 22F/22A | 4 |
| 23B | NA | 23B | 1 |
| 23F | 23F | 23F | 1 |
| 33F | 33F/33A/37 | 35A/33F/33A | 3 |
| 35A | NA | 35C/35B/35A | 2 |
| 35B | NA | 35C/35B/35A | 2 |
| 35F | NA | 47F/35F | 2 |

NA, non-amplified (not targeted by the assay)
a Performed at the Public Health Agency of Sweden
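One way to read Table 4 is to ask, for each antiserum result, whether that serotype appears among the candidates reported by sequetyping, and how often sequetyping returns a single serotype. The sketch below does this for the rows transcribed from the table; the helper `is_concordant` and the containment criterion are assumptions made for illustration, not a procedure described in the study.

```python
# (antiserum serotype, sequetyping call, number of isolates), transcribed from Table 4.
rows = [
    ("3", "3", 5), ("4", "4", 1),
    ("6A", "6A/6B/6C/6D", 1), ("6B", "6A/6B/6C/6D", 1),
    ("7F", "7F/7A", 5), ("9N", "9N/9L", 2), ("9V", "9A/9V", 1),
    ("8", "8", 2), ("31", "31", 1),
    ("11A", "11A/11D/18F", 2), ("11D", "11A/11D/18F", 1),
    ("12F", "12F", 1), ("15A", "15A", 2),
    ("15B", "15B/15C", 1), ("15C", "15B/15C", 1),
    ("18A", "18A", 1), ("18C", "18B/18C", 3),
    ("19A", "19A", 1), ("19F", "19F/19A", 1),
    ("20", "20/13", 2), ("22F", "22F/22A", 4),
    ("23B", "23B", 1), ("23F", "23F", 1),
    ("33F", "35A/33F/33A", 3),
    ("35A", "35C/35B/35A", 2), ("35B", "35C/35B/35A", 2),
    ("35F", "47F/35F", 2),
]


def is_concordant(antiserum: str, sequetyping_call: str) -> bool:
    """True if the antiserum serotype is one of the slash-separated sequetyping candidates."""
    return antiserum in sequetyping_call.split("/")


total = sum(n for _, _, n in rows)
concordant = sum(n for sero, call, n in rows if is_concordant(sero, call))
# Isolates for which sequetyping reports exactly one serotype.
single_call = sum(n for _, call, n in rows if "/" not in call)

print(f"isolates: {total}")
print(f"antiserum serotype among sequetyping candidates: {concordant}")
print(f"sequetyping call is a single serotype: {single_call}")
```

Splitting on "/" rather than using substring matching avoids false matches such as "3" inside "33F".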
Table 5. In silico identification of serogroup 6.
| S. pneumoniae strain | Accession number | PCR 1 (a): A/G | PCR 1 (a): Serogroup | PCR 2 (b): PCR | PCR 2 (b): Deletion | PCR 3 (c) | PCR 4 (d): S1-A2 | PCR 5 (e): S2-A1 | Serotype |
|---|---|---|---|---|---|---|---|---|---|
| 07AR0125 | AFBY01000000.1 | G | AC | + | + | - | + | + | 6C |
| 801 | AQTO01000000.1 | G | AC | + | - | + | - | - | 6A |
| 845 | AQTP01000000.1 | G | AC | + | - | + | - | - | 6A |
| 1488 | AQTQ01000000.1 | A | BD | + | - | - | - | - | 6B |
| 670-6B | CP002176.1 | A | BD | + | - | + | - | - | 6B |
| BHN191 | ASHN01000000.1 | A | BD | + | - | - | - | - | 6B |
| BHN237 | ASHO01000000.1 | A | BD | + | - | + | - | - | 6B |
| BHN418 | ASHP01000000.1 | A | BD | + | - | + | - | - | 6B |
| BHN427 | ASHQ01000000.1 | A | BD | + | - | + | - | - | 6B |
| BR1064 | AFBZ01000000.1 | G | AC | + | + | + | + | + | 6C |
| CCUG1350 | LQQG01000000.1 | A | BD | + | - | + | - | - | 6B |
| CDC1873-00 | ABFS01000000.1 | G | AC | + | - | + | - | - | 6A |
| EU-NP04 | AIKH01000000.1 | G | AC | + | + | + | + | + | 6C |
| GA02270 | AIKJ01000000.1 | G | AC | + | - | - | - | - | 6A |
| GA02506 | AILJ01000000.1 | A | BD | + | - | - | - | - | 6B |
| GA02714 | AIKK01000000.1 | G | AC | + | - | - | - | - | 6A |
| GA14373 | AILN01000000.1 | G | AC | + | - | - | - | - | 6A |
| GA17328 | AGPH01000000.1 | G | AC | + | - | - | - | - | 6A |
| GA17971 | AGPJ01000000.1 | G | AC | + | - | + | - | - | 6A |
| GA19077 | AGPK01000000.1 | G | AC | + | - | + | - | - | 6A |
| GA41437 | AGPN01000000.1 | G | AC | + | - | + | - | - | 6A |
| GA47033 | AGOA01000000.1 | G | AC | + | + | + | + | + | 6C |
| GA52306 | AGPZ01000000.1 | G | AC | + | + | + | + | + | 6C |
| GA60080 | ALCR01000000.1 | G | AC | + | + | + | + | + | 6C |
| GA60132 | ALCV01000000.1 | G | AC | + | + | + | + | + | 6C |
| GA60190 | ALCL01000000.1 | G | AC | + | + | + | + | + | 6C |
| NorthCarolina6A-23 | AGQL01000000.1 | G | AC | + | - | + | - | - | 6A |
| NP127 | AGQC01000000.1 | G | AC | + | - | + | - | - | 6A |
| SP6-BS73 | ABAA01000000.1 | G | AC | + | - | + | - | - | 6A |
| SPAR55 | ALCF01000000.1 | G | AC | + | - | + | - | - | 6A |
| WL400 | AVFA01000000.1 | G | AC | + | - | + | - | - | 6A |
| K15-99 | HQ662206.1 | A | BD | - | --- | - | + | + | 6D |
| K15-60 | HQ662205.1 | A | BD | - | --- | - | + | + | 6D |
| K15-17 | HQ662214.1 | A | BD | - | --- | - | + | + | 6D |
| K15-129 | HQ662208.1 | A | BD | - | --- | - | + | + | 6D |
| K15-115 | HQ662207.1 | A | BD | - | --- | - | + | + | 6D |
| K13-22 | HQ662215.1 | A | BD | - | --- | - | + | + | 6D |
| K13-110 | HQ662218.1 | A | BD | - | --- | - | + | + | 6D |
| K13-109 | HQ662217.1 | A | BD | - | --- | - | + | + | 6D |
| K13-108 | HQ662216.1 | A | BD | - | --- | - | + | + | 6D |
| B0704-047 | HQ662213.1 | A | BD | - | --- | - | + | + | 6D |
| 07-107 | HQ662212.1 | A | BD | - | --- | - | + | + | 6D |
| 07-077 | HQ662211.1 | A | BD | - | --- | - | + | + | 6D |
| 07-056 | HQ662210.1 | A | BD | - | --- | - | + | + | 6D |
| Tw02-238 | HQ662209.1 | A | BD | - | --- | - | + | + | 6D |

a wciP374F - wciP-r
b Del6Cwzy_Fv2 - Del6Cwzy_R
c wciN_6AB_F - wciN_6AB_R
d wciNbetaS1 - wciNbetaA2
e wciNbetaA1 - wciNbetaS2
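The serotype column of Table 5 follows a simple pattern: the A/G base from PCR 1 separates 6A/6C (G) from 6B/6D (A), and the two wciNbeta PCRs (PCR 4 and PCR 5) separate 6C and 6D from 6A and 6B. The sketch below encodes that pattern as a decision rule. It is inferred from the table rows only, so the function name, its arguments and the handling of conflicting results are assumptions, not the authors' implementation; PCR 2 and PCR 3 are not needed to reproduce the serotype column and are left out.

```python
def call_serogroup6(wcip_base: str, wcin_beta_s1a2: bool, wcin_beta_s2a1: bool) -> str:
    """Assign a serogroup 6 serotype from in silico PCR results.

    wcip_base      -- base reported for PCR 1 ("A" or "G")
    wcin_beta_s1a2 -- True if PCR 4 (wciNbetaS1 - wciNbetaA2) is positive
    wcin_beta_s2a1 -- True if PCR 5 (wciNbetaA1 - wciNbetaS2) is positive

    Rule inferred from the rows of Table 5 (an assumption):
    G -> 6A or 6C, A -> 6B or 6D; positive wciNbeta PCRs -> 6C or 6D.
    """
    base = wcip_base.upper()
    if base not in ("A", "G"):
        raise ValueError(f"unexpected base for PCR 1: {wcip_base!r}")
    if wcin_beta_s1a2 != wcin_beta_s2a1:
        # Conflicting wciNbeta results do not occur in Table 5; leave them unresolved.
        return "unresolved"
    has_wcin_beta = wcin_beta_s1a2
    if base == "G":
        return "6C" if has_wcin_beta else "6A"
    return "6D" if has_wcin_beta else "6B"


# Example rows from Table 5: (strain, A/G, PCR 4, PCR 5)
examples = [
    ("07AR0125", "G", True, True),
    ("801", "G", False, False),
    ("1488", "A", False, False),
    ("K15-99", "A", True, True),
]
for strain, base, pcr4, pcr5 in examples:
    print(strain, call_serogroup6(base, pcr4, pcr5))
```

Returning "unresolved" for conflicting PCR 4 and PCR 5 results is a conservative choice for this sketch, since the table gives no example of how such a case would be typed.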
Figure 1. Schematic representation of the cpsB gene targeted for amplification with two primer pairs.
cpsB lies in the conserved region of the Streptococcus pneumoniae CPS locus, together with cpsA,
cpsC and cpsD.
Figure 2. Schematic representation of serogroup 6 differentiation.