Molecular characterization and epidemiology of Streptococcus pneumoniae serotype 8 in Denmark

Background: Streptococcus pneumoniae serotype 8 incidence has increased in Denmark after the introduction of pneumococcal conjugate vaccines (PCV). The mechanism behind the serotype 8 replacement is not well understood. In this study, we aimed to present epidemiological data on invasive pneumococcal disease (IPD) and molecular characterization of 96 serotype 8 clinical isolates.
Methods: IPD data from 1999 to 2019 were used to calculate the incidence and age distribution. Whole-genome sequencing (WGS) analysis was performed on 96 isolates (6.8% of the total serotype 8 IPD isolates in the period) to characterize the isolates with respect to pneumococcal lineage traits, a range of genes with potential for species discrimination, presence of colonization and virulence factors, and molecular resistance pattern.
Results: The serotype 8 IPD incidence increased significantly (P < 0.05) for the age groups above 15 years after the introduction of PCV13, primarily affecting the elderly (65+). All isolates were phenotypically susceptible to penicillin, erythromycin and clindamycin. Molecular characterization revealed seven different MLST profiles, with ST53 as the most prevalent lineage (87.5%) among the analyzed serotype 8 isolates. The genes encoding the cell-surface proteins (lytA, rpsB, pspA, psaA and xisco) and the pneumococcal toxin pneumolysin (ply) were present in all isolates, while the genes for the membrane transporter proteins (piaA/piaB/piaC), the capsular genes (cpsA (wzg) and psrP), the metallo-binding proteins (zmpB and zmpC), and the neuraminidase proteins (nanA/nanB) were variably present. Surprisingly, the putative transcriptional regulator gene SP2020 was present in only 98% of the isolates. Susceptibility to penicillin, erythromycin and clindamycin was molecularly confirmed.
Conclusion: The observed serotype 8 replacement was not reflected in a change in the MLST profile or in changes in antibiotic resistance or virulence determinants.
Supplementary Information The online version contains supplementary material available at 10.1186/s12879-021-06103-w.


Background
Infections with Streptococcus pneumoniae affect all age groups, although predominantly children and the elderly. Invasive pneumococcal diseases (IPD) such as bacteremia, meningitis, and pneumonia cause high morbidity and mortality worldwide [1]. Pneumococci can be divided into serotypes based on their capsular polysaccharide, with at least 100 acknowledged serotypes [2,3].
The introduction of the 7-valent pneumococcal conjugate vaccine (PCV7) in 2007 and the 13-valent vaccine (PCV13) in 2010 has changed the epidemiology of serotype-specific IPD in Denmark [4]. While the IPD incidence for the majority of serotypes included in the PCV13 vaccine has decreased, the incidence of non-PCV serotypes such as serotype 8 has increased post PCV vaccination [5]. The frequency of serotype 8 IPD increased globally by 120% from 2013 to 2017 [6]. The observed replacement and increase of serotype 8 is similar to the emergence of the virulent and multidrug resistant serotype 19A observed in Massachusetts, USA, after introduction of PCV7, although an increase in multidrug resistance has not been observed for serotype 8 [7].
In Europe, the current serotype 8 multi locus sequence type (MLST) is dominated by ST53 [8,9], which constitutes up to 90.1% in Spain [9], and is related to the major clone Netherlands8-33 (https://www.pneumogen.net/pmen/clone-collection.html, accessed 14-04-2021) [10]. Alteration in the clonal epidemiology or in virulence related pneumococcal genes could be the explanation for the post PCV increase of serotype 8, as observed with serotype 19A ST320 post PCV7 in the USA [7] or with serotype 19A ST199 and ST994 and the change in PCV strategies in Belgium [11]. In Denmark, we have observed a post PCV13 increase in serotype 24F ST162 [12]. However, even though the non-PCV serotype 8 has been the dominant cause of IPD in Denmark for several years, little is known about the serotype. Serotype 8 is an example of successful serotype replacement due to PCV introduction in Denmark. The intention of this study is therefore to investigate the mechanism behind the significant increase in serotype 8 IPD cases post PCV by evaluating epidemiological data, the clonal epidemiology, and the presence/absence of virulence related pneumococcal genes. Whole-genome sequencing (WGS) analysis was performed on a representative number of serotype 8 isolates to investigate potential changes in the clonal distribution, genes related to antimicrobial susceptibility, and species-specific virulence genes pre- and post-PCV vaccination.

Strain collection
Data from all S. pneumoniae serotype 8 IPD isolates from 1999 to 2019 were retrieved from the national Neisseria and Streptococcus Reference Laboratory (NSR), Statens Serum Institut (SSI) (Table 1). The registered S. pneumoniae serotype 8 IPD cases were based on either isolates or pneumococcal DNA obtained from normally sterile fluids such as blood, cerebrospinal fluid, and joint fluid [13]. The cases consisted of more than 90% bacteremia cases, 5% meningitis cases, and less than 5% other infections found in normally sterile sites. Due to the limited number of meningitis cases and other infections, these data were combined in one group. Detailed data on the total number of IPD cases in Denmark for all age groups have previously been presented [4,5]. Population data (1999-2019) were obtained from Statistics Denmark (www.dst.dk, accessed 10-03-2021). The number of serotype 8 IPD cases and of total IPD cases per year in Denmark is presented in Table 1.

Data analysis
The data were analyzed using RStudio version 1.2.5001 and R version 3.6.1 (http://www.r-project.org/, accessed 10-03-2021) for descriptive statistical analysis. The calculations consisted of a t-test, a two-tailed Fisher's exact test, and a generalized linear model (GLM), and were used to obtain the incidence rate, incidence rate ratio (IRR), p-value, and confidence interval (CI).
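The incidence rate and IRR mentioned above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names and the case and population counts are hypothetical, and the Wald confidence interval on the log scale is an assumed, commonly used choice rather than a method confirmed by the source.

```python
import math


def incidence_per_100k(cases, population):
    """Incidence rate expressed as cases per 100,000 persons."""
    return cases / population * 100_000


def incidence_rate_ratio(cases_a, pop_a, cases_b, pop_b, z=1.96):
    """IRR of period B versus period A with a Wald CI on the log scale.

    Assumption: case counts are Poisson distributed, so the standard
    error of log(IRR) is sqrt(1/cases_a + 1/cases_b).
    """
    irr = (cases_b / pop_b) / (cases_a / pop_a)
    se_log = math.sqrt(1 / cases_a + 1 / cases_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts, for illustration only (not data from this study).
rate = incidence_per_100k(50, 1_000_000)
irr, ci_low, ci_high = incidence_rate_ratio(50, 1_000_000, 100, 1_000_000)
print(f"{rate:.1f} per 100,000; IRR {irr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An IRR whose confidence interval excludes 1 would indicate a change in incidence between the two periods under these assumptions.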

Identification of S. pneumoniae isolates
The serotype 8 IPD isolates were phenotypically species-confirmed based on optochin susceptibility and identification of the serotype. Serotyping was performed either by the Quellung reaction alone or by the ImmuLex™ Pneumotest Kit (SSIDiagnostica.com, Hillerød, Denmark) combined with the Quellung reaction using type-specific pneumococcal rabbit antisera (SSIDiagnostica.com, Hillerød, Denmark) [12,14].

Characterization of 96 clinically selected isolates
From a total of 1378 S. pneumoniae serotype 8 IPD isolates collected from 2006 to 2018, 92 (6%) were selected for molecular characterization using WGS. Of the selected isolates, two were from 2006 and 10 were selected from each of the years 2007, 2009-2014, 2016, and 2018 (90 isolates in total). The isolates were not randomly selected but were chosen based on the 65+ age group, gender, and collection site, and they represented patients from all parts of the country. The isolates were from patients with a mean age of 72.7 years, 50% of the isolates were from female patients, and 90% were from bacteremia cases.
Data on the 96 selected isolates are presented in Table 2.

Molecular species identification
WGS was performed as described by Kavalari et al. [12]. Genomic DNA was extracted using the DNeasy Blood and Tissue Kit (QIAGEN, Hilden, Germany), fragment libraries were made with the Nextera XT Kit (Illumina, Little Chesterford, UK), and the libraries were sequenced by 250 bp paired-end sequencing (MiSeq™; Illumina) according to the manufacturer's instructions. The paired-end Illumina data were de novo assembled using the SKESA assembler [16]. All bioinformatics were performed using the free software NCBI genome workbench (version 3.0.1, https://www.ncbi.nlm.nih.gov, accessed 14-04-2021).

Molecular characterization of the capsular genes of the selected isolates
All 96 isolates were genetically confirmed to be S. pneumoniae as described by Kavalari et al. [12]. In this study, a gene was defined as detected when it met a cut-off of 95% identity and 80% coverage [2,17,18]. The isolates were serotyped using PneumoCaT (version 1.2) [2]. The genomic sequence data for the 96 isolates have been deposited in the ENA Genbank under project no. PRJEB42355 (https://www.ebi.ac.uk/ena/browser/view/PRJEB42355, accessed 19-04-2021).
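The presence/absence rule above can be sketched in Python as follows. This is only an illustration of the stated cut-offs, not the pipeline used in the study: the `Hit` class, its field names, and the example alignment values are hypothetical, and treating the thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass

MIN_IDENTITY = 95.0  # percent identity cut-off stated in the text
MIN_COVERAGE = 80.0  # percent coverage cut-off stated in the text


@dataclass
class Hit:
    """Best alignment of one reference gene against one assembly (hypothetical)."""
    gene: str
    identity: float        # percent identity of the alignment
    aligned_length: int    # aligned reference positions
    reference_length: int  # full length of the reference gene

    @property
    def coverage(self):
        """Percent of the reference gene covered by the alignment."""
        return 100.0 * self.aligned_length / self.reference_length


def is_detected(hit):
    """Call a gene present only if both cut-offs are met (assumed inclusive)."""
    return hit.identity >= MIN_IDENTITY and hit.coverage >= MIN_COVERAGE


# Made-up alignment values, for illustration only.
hits = [
    Hit("lytA", 99.2, 950, 957),   # passes both cut-offs
    Hit("piaB", 97.0, 400, 1000),  # fails coverage (40%)
    Hit("zmpC", 91.0, 980, 1000),  # fails identity
]
calls = {h.gene: is_detected(h) for h in hits}
print(calls)
```

Requiring both thresholds means that a truncated gene with high identity, or a full-length but divergent homologue, is reported as absent.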

Gene profile
The tested genes were based on selected studies: presence of the virulence genes lytA and ply, coding for autolysin and the pore-forming toxin pneumolysin [20].
Pathogenwatch (version 3.7.5, https://pathogen.watch/, accessed 10-03-2021) was used to determine Global Pneumococcal Sequence Clusters (GPSC), and to confirm organism, serotype, MLST and PBP. An overview of the susceptibility-related genes can be seen in Table 2.
Table 2 Selected genes detected in the S. pneumoniae isolates. These parameters were used for positive gene detection: cut-off of overlap as 80% and 95% identity.

Ethical considerations
The data and samples from patients are collected routinely for national surveillance purposes; therefore, no ethical approval or informed consent from patients or guardians is required. Statens Serum Institut (SSI), which is under the auspices of the Danish Ministry of Health, has a general approval from the Danish Data Protection Agency (record number 2007-41-0229) (https://en.ssi.dk/research, accessed 14-04-2021, https://en.ssi.dk/aboutus, accessed 14-04-2021) to publish the data. All presented data are anonymized and cannot be connected to a patient.

Results
Incidence of invasive pneumococcal disease in 1999-2019 due to serotype 8
The mean serotype 8 IPD incidence for most age groups in the period 2008-2010 increased compared to 1999-2007, whereas the incidence in all age groups with IPD cases increased in 2011-2019 compared to both 1999-2007 and 2008-2010 (Table 3). The age groups above 65 years had the highest mean incidence in 2011-2019 and the age group '2-4' had the lowest (Table 3).
The effect of PCV13 (introduced gradually in 2010) on serotype 8 IPD was assessed with a two-tailed Fisher's exact test comparing the period 2008-2010 with the period 2011-2019. The test showed a significant increase in the incidence of serotype 8 IPD for all groups above the age of 15 (Table 3).
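A two-tailed Fisher's exact test on a 2x2 table can be sketched in plain Python as shown here. This is an illustrative implementation, not the R routine used by the authors; the function names are hypothetical, and the layout (rows as periods, columns as cases and non-cases) is an assumption about how the comparison was set up. Log-gamma is used so that large population counts do not overflow.

```python
import math


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x):
        return (_log_comb(row1, x) + _log_comb(row2, col1 - x)
                - _log_comb(total, col1))

    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    log_obs = log_p(a)
    # Small tolerance so that ties with the observed table are included.
    p = sum(math.exp(log_p(x)) for x in range(x_min, x_max + 1)
            if log_p(x) <= log_obs + 1e-7)
    return min(1.0, p)


# Small made-up table, for illustration only (not data from this study).
p_value = fisher_exact_two_tailed(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

For an incidence comparison, `a` and `c` would hold the case counts in the two periods and `b` and `d` the corresponding numbers of persons without disease.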
When the effect of the PCV vaccines on serotype 8 IPD was analyzed with the GLM, the serotype 8 IPD incidence in the age groups '65-74' and '85+' (P = 0.02 and P = 0.001, respectively) increased significantly, by an average of 13-14% per year in the period from 2011 to 2019. The serotype 8 IPD incidence peaked in 2015-2018 and decreased for the majority of age groups in 2019 (Fig. 2).
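How an average yearly percentage change relates to a GLM coefficient can be sketched as follows. The source states only that a GLM was used; the log link (as in Poisson regression) assumed here, the function names, and the example coefficient are illustrative assumptions, not details from the study.

```python
import math


def annual_percent_change(beta_year):
    """Percent change per year implied by a log-link coefficient for year."""
    return (math.exp(beta_year) - 1.0) * 100.0


def cumulative_fold_change(percent_per_year, years):
    """Fold change after compounding a yearly percent change over `years`."""
    return (1.0 + percent_per_year / 100.0) ** years


# Hypothetical coefficient chosen so that the yearly change equals 13%.
beta = math.log(1.13)
apc = annual_percent_change(beta)
fold = cumulative_fold_change(13.0, 8)  # eight yearly steps, 2011 to 2019
print(f"{apc:.1f}% per year, {fold:.2f}-fold over 8 years")
```

Under these assumptions, a steady 13% yearly increase compounds to more than a doubling of the incidence over the period.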

MLST
Seven different MLST profiles were detected among the 96 isolates of serotype 8, of which 84 (87.5%) were ST53, six ST404, two ST1480, one ST7203, one ST3714, one ST2234, and one had an unknown ST profile. Four different GPSC were detected, of which 85 isolates were GPSC3 (consisting of ST53 and the unknown ST), nine GPSC98 (consisting of ST404, ST1480 and ST7203), one GPSC224 (consisting of ST3714) and one GPSC336 (consisting of ST2234) (Fig. 1, Table 2). It was not possible to detect any geographical clustering. ST53 was detected in isolates from all parts of the country, ST404 was detected in isolates from four different hospitals in both Jutland and Zealand, and ST1480 was detected in samples from two different hospitals representing Jutland and Zealand.

Comparison of phenotypic and genotypic susceptibility profiles
All isolates were genetically identified as serotype 8, confirming the detected phenotype. All of the 92 recent isolates with available antimicrobial susceptibility profiles were phenotypically susceptible to penicillin, erythromycin, and clindamycin. Four different PBP signatures were detected: 85 isolates were '3-6-5', nine were '3-2-5', one was '50-0-0', and one was '0-4-2' (Fig. 1, Table 2). In a ResFinder search, three isolates carried genes with 99% identity to tetracycline resistance genes. These isolates were of sequence type ST53 and the clonally related new ST (Fig. 1 and Table 2). Based on WGS, the estimated tetracycline resistance is 3.13% of the total 96 isolates.

Comparison of phylogenetic trees
A correlation of isolates GPSC, MLST and PBP was demonstrated in the SNP phylogenetic tree when isolates in the same GPSC clustered together and grouped according to MLST and PBP (Fig. 1). All ST53 isolates were arranged close to the root isolate ('8-1-1950'), with two clades of twelve and three ST53 isolates in a separate clade further from the root. The clade with three isolates had genes for tetracycline resistance (isolates 243-2010, . Comparison of the SNP tree with an rMLST-based tree from PubMLST showed identical branches and separation of isolates (data not shown).

Characterization of selected pneumococcal genes
The genes lytA, ply, xisco, rpsB, pspA, and psaA were detected in all 96 pneumococcal serotype 8 isolates, the genes sp2020, piaA/piaB/piaC, zmpC, and nanA/nanB were detected in the majority of isolates, and the genes zmpB, cpsA and psrP were absent in all isolates (Table 2).
The piaA/piaB/piaC genes were present in all of the isolates before PCV13 introduction and absent in 12.5% of the isolates after the introduction of PCV13 (Table 2). Only ST53 and the clonally related isolate carried the zmpC gene, while it was absent in all other isolates.

Discussion
Introduction of the PCV vaccines has reduced IPD in children and other age groups. However, the introduction of the PCV vaccines has also contributed to an increase in the proportion of non-vaccine serotypes such as serotype 8 in countries like Denmark, the United Kingdom, Spain, and other European countries [6], showing a post-vaccination serotype replacement [5,6,33,34].
Fig. 2 Incidence of serotype 8 IPD per 100,000. Arrows indicate when PCV7 and PCV13 were introduced. 'Total' includes all age groups
The serotype 8 IPD incidence in Denmark increased after the PCV13 introduction, predominantly affecting the age groups above 65 years. A significant increase was also observed for the age group 85+ after the introduction of PCV7; the serotype 8 IPD incidence increased in 2008-2009 and then decreased to pre-PCV7 levels in 2010 (Fig. 2, Table 3). The large increase of serotype 8 in Denmark (Fig. 2, Tables 1 and 3) after the introduction of PCV13 was not foreseen in any published Danish IPD and pneumococcal carriage data up to December 2013 [4,13,35]. It was observed that around 80% of IPD cases in 2012-2013 were caused by non-vaccine serotypes (8, 10A/B, 12F, 15B/C, 20, 22F, 33F, 38, 23B, 24F), with no clear predominance of any specific serotype [4]. In 2014, Danish IPD data on non-vaccine serotypes indicated the dominance of serotype 8, although at that time it was not clear that serotype 8 would continue to be the leading cause of IPD in Denmark (Table 1) [5]. Nor did Danish carriage studies in children below 5 years of age in 2000 [35] and below 2 years of age in 2014 to 2016 [36] show any indication of high carriage of serotype 8 that could explain transmission to the elderly. Similarly limited carriage of serotype 8 in children below 5 years of age has been observed in other countries [37]. It has furthermore been found that carriage data from children are of limited use for forecasting changes in general IPD epidemiology, and that serotype 8 is a possible example of a serotype transmitted directly among older age groups [38]. This observation is supported by studies from the UK performed on age groups other than children, in which serotype 8 carriage was observed [37,39]. In Denmark, no carriage studies have been performed on age groups other than children that could indicate direct transmission among other age groups [36].
The current Danish pneumococcal data cannot provide an explanation or warning of the present dominance of serotype 8 [4,5,13,35,36]. Moreover, the epidemiological data do not provide an explanation for the dominance of serotype 8 IPD cases observed in Denmark.
At present, only the pneumococcal polysaccharide vaccine (PPV23) includes serotype 8. PPV23 has shown a significant vaccine efficacy against serotype 8, although the protection is of limited duration [40]. The limited duration of protection may explain the limited effect of PPV23 in England against serotype 8 IPD despite a national PPV23 immunization program for the age group of 65+ since 2003 [40,41]. The serotype 8 IPD in Denmark predominantly affects the age groups above 65 years (Fig. 2, Tables 1 and 3), and it will be important to monitor the serotype 8 IPD incidence with the introduction of PPV23 into a vaccination program for risk groups and the elderly 65+ [42].
Serotype 8 is often observed to be susceptible to antimicrobial drugs [8]. Spain has, however, seen an emergence and spread of S. pneumoniae serotype 8 ST63, a multidrug resistant clone resistant to erythromycin, clindamycin, tetracycline, and ciprofloxacin [8].
In Denmark we have not observed any occurrence of non-susceptible serotype 8 isolates (DANMAP, https://www.danmap.org/, accessed 10-03-2021), and the post PCV13 increase in serotype 8 incidence has not been accompanied by any changes in the susceptibility of serotype 8 isolates. The PBP profiles of the sequenced isolates in this study corresponded well with the predicted PBP profile and the phenotypic susceptibility testing (Fig. 1) [12,32].
The predominant S. pneumoniae serotype 8 MLST type is ST53, belonging to cluster GPSC3 [43-46], which constituted 87.5% of the sequenced isolates in this study. The ST53 clone was found to be dominant both before and after the introduction of PCV7 and PCV13 (Table 2). The increase in serotype 8 can, therefore, not be related to changes in serotype 8 clones. Other serotype 8 MLST types observed in this study, such as ST404 and ST1480, have been reported in other European countries, Brazil, and the UK [45,47-51], while MLST types ST3714 and ST2234 have only been observed in Denmark, Sweden, Turkey, Belarus, the UK, Saudi Arabia, and Kenya (PubMLST DataBase, https://pubmlst.org/spneumoniae/, accessed 10-03-2021). The historical isolate 8-4-1962 (ST7203) was related to clone ST404 and was in the same GPSC98 cluster. An unknown ST was detected in isolate 243-2010, which shared six of seven allelic variants with ST53 and was in the same GPSC3 cluster (Fig. 1). Overall, all MLST types in this study were known as susceptible clones, although three isolates showed the presence of the tet(M) gene (Table 2).
The SNP phylogenetic tree showed that it was not possible to see any clades of isolates segregated by the year before and after the PCV introduction, indicating that it might not be a gene mutation causing the serotype 8 increase (Fig. 1). The tree illustrates two clades of twelve and three ST53 isolates, respectively, that were separated from the majority of ST53 isolates. The differentiation of the clades could, however, not be linked to the year of isolation. We do not know the basis of the difference for the twelve isolates based on the genes selected in this study, and further gene analysis needs to be performed to reveal which genes were responsible for the discrepancy. The clade of three isolates showed molecular tetracycline resistance, differentiating them from the majority of the ST53 isolates.
Comparing the SNP tree with a tree based on the 53 rMLST genes from PubMLST species identification showed nearly identical branches, although the SNP tree showed more detail in the branches, as the clade with the three isolates containing the tet(M) gene was not present in the rMLST tree (data not shown). In general, the authors found that species identification using PubMLST rMLST was easy to use; however, it did not provide any additional information on the cause of the increase in serotype 8.
Evaluation of species-specific genes described in various studies [12,20,30,52] did not show a clear presence/absence of genes defined by the PCV introduction (Table 2). The generally used lytA gene and other genes suggested for species identification of S. pneumoniae were detected in all our isolates (Table 2), similar to our previous observations [12]. Some genes were not observed in all our isolates; SP2020 [30] was not found in two of our isolates (Table 2). The zmpC gene was present in all ST53 isolates and in the clonally related isolate, while it was absent in all other STs, which is consistent with observations from previous studies [22]. Interestingly, however, the zmpC gene has been described to suppress S. pneumoniae virulence in experimental models of pneumococcal meningitis [21]. In this study, specific meningitis data are too limited to evaluate the effect on the number of meningitis cases; however, the zmpC gene was found in isolates from cerebrospinal fluid and did not seem to be linked to reduced invasiveness of serotype 8 (Table 2). The genes piaA/piaB/piaC were present in all isolates before the introduction of PCV13. However, they were lacking in 12.5% of the isolates (8 isolates) after the introduction. Although the absence of the genes piaA/piaB/piaC first appeared after the PCV13 introduction, it does not explain the increase in serotype 8, as only a limited number of isolates lacked the genes (Table 2). Interestingly, the SP2020 or piaB gene in combination with the lytA gene has been suggested for the detection of pneumococci in pure cultures or swab samples [21,30]. However, when analyzing the 96 isolates in this study, we observed isolates that did not carry the SP2020 or piaB gene (Table 2). It is therefore questionable how favorable these genes are compared to the use of the ply gene for detection of Danish pneumococcal isolates.
All isolates in this study (Table 2) and in the study by Kavalari et al. [12] showed the presence of the ply gene. It has furthermore been described that the piaB gene is lacking only in non-typeable pneumococci [21]. In this study, however, the piaB gene was not found to be unique for the invasive capsulated isolates, as 7 isolates lacked the gene (Table 2).
A limitation of the study is that not all serotype 8 isolates were sequenced, with the caveat that we might not detect possible mini-outbreaks of specific clones as a possible explanation for the observed serotype 8 replacement in Denmark. However, the number of isolates was sufficient to find interesting sequence results (e.g. isolates not possessing the SP2020 or piaB genes (Table 2)).

Conclusion
In conclusion, with the introduction of PCV13 in the child vaccination program in Denmark, a significant (P < 0.05) increase was observed in the non-PCV serotype 8 IPD incidence for the age groups above 65 years, demonstrating serotype replacement in Denmark. No reason was found for the successful replacement by serotype 8 based on the Danish epidemiological studies. Furthermore, the increase in serotype 8 was not followed by an increase in non-susceptible serotype 8 isolates or by a change in clones, as the majority of molecularly characterized isolates belonged to the ST53 clone. Analysis of potential changes in the clonal distribution, genes related to antimicrobial susceptibility, and species-specific genes pre- and post-PCV vaccination did not show any changes that could be related to the PCV introduction in Denmark. Therefore, future studies still need to identify a possible marker for why serotype 8 is so successful in replacing the PCV-included serotypes in Denmark, and thereby possibly improve the prediction of the next non-PCV serotype causing a high incidence of IPD in Denmark.