Comparative analysis of Streptococcus pneumoniae transmission in Portuguese and Finnish day-care centres

Background: Day-care centre (DCC) attendees play a central role in maintaining the circulation of Streptococcus pneumoniae (pneumococcus) in the population. The prevalence of pneumococcal carriage is highest in DCC attendees but varies across countries and is found to be consistently lower in Finland than in Portugal. We compared key parameters underlying pneumococcal transmission in DCCs to understand which of these contributed to the observed differences in carriage prevalence.
Methods: Longitudinal data about serotype-specific carriage in DCC attendees in Portugal (47 children in three rooms; mean age 2 years; range 1–3 years) and Finland (91 children in seven rooms; mean age 4 years; range 1–7 years) were analysed with a continuous-time event history model in a Bayesian framework. The monthly rates of within-room transmission, community acquisition and clearing carriage were estimated.
Results: The posterior mean of the within-room transmission rate was 1.05 per month (Portugal) vs. 0.63 per month (Finland). The smaller rate of clearance in Portugal (0.57 vs. 0.73 per month) is in accordance with the children being younger. The overall community rate of acquisition was larger in the Portuguese setting (0.25 vs. 0.11 per month), consistent with the groups belonging to a larger DCC. The model adequately predicted the observed levels of carriage prevalence and longitudinal patterns in carriage acquisition and clearance.
Conclusions: The difference in prevalence of carriage (61% in Portuguese vs. 26% among Finnish DCC attendees) was attributed to the longer duration of carriage in younger attendees and a significantly higher rate of within-room transmission and community acquisition in the Portuguese setting.


Background
Streptococcus pneumoniae (pneumococcus) is one of the most important bacterial causes of respiratory tract infections worldwide [1]. While it can cause serious illnesses, pneumococcus carriage in its ecological niche, the human nasopharynx, is generally asymptomatic. More than 90 serotypes have been described based on differences in the polysaccharide capsule [2]. Colonisation by pneumococcus can occur soon after birth and remains common in the first years of life [3], with virtually every child experiencing a sequence of colonisation events by alternating serotypes [4]. Previous statistical epidemiology studies have found that serotypes compete to some extent for colonisation, as carriage of a certain serotype interferes with subsequent colonisation by pneumococcus of a different serotype [5][6][7][8][9][10].
Serotype specificities in transmission and competition parameters are subject to ongoing research and debate [7,9,11,12]. Cauchemez et al. [11] concluded that there were no significant differences in rates of acquisition and clearance between vaccine and non-vaccine serotypes within school classes in France. Using the same dataset, de Cellès et al. [12] reported significant differences in the rates of acquisition estimated independently for different serotypes while assuming shared clearance rates. Erästö et al. [7] estimated serotype-specific parameters simultaneously in Bangladeshi families and concluded that, while more common serotypes in the study area were more often acquired from the community, these differences lose significance when conditioning on exposure within the family. More recently, Lipsitch et al. [9], focusing on the competitive ability of specific serotypes in Kenya, reported significant differences in susceptibility to competition, which appears positively correlated with the rate of clearance.
Studies that estimate pneumococcal transmission parameters have either used different models, statistical frameworks, parametrisations or transmission contexts, for example families, hospitals or day-care centres (DCCs). Our aim was to compare two different pneumococcal settings in Europe, one with high carriage prevalence (Portugal) and one with low carriage prevalence (Finland) by estimating pneumococcal transmission parameters using a single statistical method with data from a common context, in this case, DCCs. We apply a Bayesian data augmentation approach [6,13] to estimate acquisition and clearance rates in day-care centres in Portugal [14] and Finland [15]. We assume shared parameters among serotypes in each country and focus on the comparison between the two settings.

Methods
The empirical data
Two datasets were employed in this study, both of which have been described in detail elsewhere [14,15]. For the first dataset, all children from three rooms (n_c = 47) of a day-care centre (DCC) with approximately 150 children in Lisbon, Portugal, were scheduled for 11 nasopharyngeal samples at approximately one-month intervals from February 1998 to February 1999. No samples were collected in July and August as the DCC was closed for the summer break. There were 16, 15 and 16 attendees in the three rooms, with a mean age of 2 years (range 1.2-3.1) at the beginning of the study. Altogether, 80% of the scheduled samples were obtained, resulting in a total of 416 samples from the three DCC rooms.
Approval for the Portuguese study was obtained from the Ministry of Education and the director of the day care centre. Signed informed consent was obtained from the parents or guardians of all children.
For the second dataset, a portion of attendees (n_c = 61) in all seven rooms in three DCCs with a total of approximately 150 children in the Tampere region, Finland, were scheduled for 10 nasopharyngeal samples at approximately one-month intervals from September 2001 to May 2002. In the three DCCs, 25, 18 and 18 children were enrolled, respectively, and 90%, 82% and 99% of the samples were obtained, resulting in a total of 215, 148 and 179 samples. At the beginning of the study, the mean age of the enrolled children was 4.2 years (range 1.2-6.6).
In the Finnish study, informed consent was obtained from parents or guardians of the children. The study was conducted in compliance with the Helsinki Declaration and the ethics committee of the Pirkanmaa Hospital District (PSHP) gave a favourable opinion on the study protocol.
In both studies, calcium alginate swabs were taken from the deep nasopharynx. Pneumococcal identification and serotyping were done following routine procedures as previously described [14,15]. Only one serotype per positive sample was identified. In the Portuguese dataset, the genotype and antibiotic resistance of pneumococcal isolates were also identified but only the serotype information is used in the present study. Among the few clones that were merged into one serotype, most were segregated in different rooms and represented only a few samples. In a reanalysis of samples from one of the Finnish day care centres (rooms F1 and F2), samples were incubated and cultured after an enrichment step. Pneumococci were identified from both blood and blood agar with gentamycin plates and serotyped. In addition, multiple serotypes were searched directly by the Quellung method [16].

Transmission model
We modelled transmission dynamics of pneumococcal serotypes in children as previously described [6,13]. Children were assigned a state corresponding to their carriage status. Thus, at any given time, a child was either a non-carrier (state s = 0) or a carrier of one of the n_s serotypes (state s ∈ {1, ..., n_s}). The rate of acquisition was taken to depend on the prevalence of pneumococci in the DCC room as well as on exposure from outside the room. We assumed that a carrier could only acquire serotypes different from the currently colonising serotype and that the rate of acquisition may be affected by current carriage. All serotypes were assumed to share the same acquisition and clearance rates.
For a non-carrying child i, the rate of acquisition of serotype s at time t was defined as

$$\beta \, \frac{C_i^{s}(t)}{n_i - 1} + \kappa,$$

where n_i is the number of attendees in the room of the child and C_i^s(t) is the number of carriers of serotype s in his/her room just before time t. Parameter β is the rate at which one carrier transmits carriage to other children in the room. Parameter κ is the community force of infection (per serotype), which can be interpreted as the part of the acquisition rate which cannot be assigned to observed exposure within the room.
For an individual currently carrying pneumococcus, the rate of acquisition was multiplied by a competition parameter (relative rate) φ. The clearance rate within child i was modelled using a Weibull hazard with shape parameter ρ and rate parameter μ:

$$\rho \, \mu^{\rho} \, (t - t_i^{a})^{\rho - 1},$$

where t_i^a is the acquisition time of the carriage episode. The shape parameter ρ of the Weibull distribution was set to value 3 to prefer carriage episodes with lengths close to the mean over very short carriage episodes. In particular, with this choice the median and mean of the Weibull distribution are approximately the same. In summary, the hazard rate for child i moving from state r to state s at time t was defined as

$$\lambda_i^{r,s}(t) =
\begin{cases}
\beta \, \dfrac{C_i^{s}(t)}{n_i - 1} + \kappa, & r = 0,\ s \neq 0, \\
\varphi \left( \beta \, \dfrac{C_i^{s}(t)}{n_i - 1} + \kappa \right), & r \neq 0,\ s \neq 0,\ s \neq r, \\
\rho \, \mu^{\rho} \, (t - t_i^{a})^{\rho - 1}, & r \neq 0,\ s = 0.
\end{cases}
\qquad (1)$$

The Portuguese DCC data referred to children from three rooms. Due to the clear temporal separation caused by the summer break, these were analysed as 6 rooms (3 rooms before and 3 rooms after the summer break). The Finnish children corresponded to a total of 7 rooms (2, 3 and 2 rooms in the three DCCs). For simplicity, we only specified exposure and transmission within the same room and considered transmission between rooms as part of the general community exposure.
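As an illustration only, the hazard of model (1) can be written as a few small Python functions. This is a sketch and not the authors' implementation: the function and argument names are hypothetical, the within-room term is assumed to be scaled by the number of other attendees (n_i − 1), the Weibull hazard is assumed to take the form ρ μ^ρ (t − t_a)^(ρ−1), and the parameter values in the usage lines are illustrative numbers of the same magnitude as the estimates reported in the Results.

```python
def acquisition_rate(beta, kappa, carriers_of_s, room_size):
    """Monthly rate at which a non-carrier acquires serotype s.

    Assumption: within-room exposure is scaled by the number of
    other attendees in the room (room_size - 1).
    """
    return beta * carriers_of_s / (room_size - 1) + kappa


def clearance_hazard(mu, months_since_acquisition, rho=3.0):
    """Weibull hazard of clearing carriage (assumed parametrisation)."""
    return rho * mu**rho * months_since_acquisition ** (rho - 1)


def transition_hazard(r, s, *, beta, kappa, mu, phi,
                      carriers_of_s=0, room_size=2,
                      months_since_acquisition=0.0, rho=3.0):
    """Hazard of moving from state r to state s (0 = non-carrier)."""
    if r == s:
        return 0.0  # staying in the same state is not a transition
    if s == 0:
        # r != 0 here: a carrier clears the current serotype
        return clearance_hazard(mu, months_since_acquisition, rho)
    rate = acquisition_rate(beta, kappa, carriers_of_s, room_size)
    # Current carriage scales acquisition by the competition parameter phi
    return rate if r == 0 else phi * rate


# Illustrative values only (not the fitted posterior)
params = dict(beta=1.05, kappa=0.018, mu=0.57, phi=0.51)
non_carrier = transition_hazard(0, 1, carriers_of_s=3, room_size=16, **params)
carrier = transition_hazard(2, 1, carriers_of_s=3, room_size=16, **params)
clearing = transition_hazard(2, 0, months_since_acquisition=1.0, **params)
```

Keeping acquisition and clearance in separate helpers mirrors the structure of the model: the competition parameter only rescales the acquisition term, while clearance depends on the time since the episode began.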
Exposure from the community was assumed to remain constant during the study period. The number of serotypes (n_s) the children were at risk of acquiring from contacts with people outside the rooms was assumed to be constant and equal to the total number of observed serotypes in the dataset (14 in Portugal, 20 in Finland). This choice affects the estimates of community acquisition rates per serotype but has a negligible effect on the overall rate of community acquisition.

Statistical methods
Adopting a Bayesian latent process approach, a likelihood-based estimation of the model parameters θ = {β, κ, μ, φ} was applied using the same algorithm and setup as Hoti et al. [6] with a new modification which handles the left censoring of the first episodes of carriage at the start of the follow-up (see below). A Markov chain Monte Carlo (MCMC) algorithm was constructed to sample from the joint posterior distribution of the model parameters and carriage histories (unobserved events) compatible with the observed data, including the carriage histories of children with completely missing data (59% of attendees in the Finnish dataset; no child had completely missing data in the rooms of the Portuguese dataset).
Given the carriage histories, i.e. the initial state of carriage and all times at which the state changes for all the children in the rooms, the complete data likelihood can be calculated. For child i, let T_i^{r,s} denote the set of times the carriage status changes from state r to state s in the time interval ]t_min, t_max], where t_min is one day before the first nasopharyngeal sample and t_max is the day after the last nasopharyngeal sample is taken in the dataset.
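Let $T_i = \bigcup_{r \neq s} T_i^{r,s}$ be the collection of all times child i changes carriage status. The likelihood function of the model parameters θ, based on data from n_c individuals on the time interval ]t_min, t_max] and defined by model (1), is

$$L(\theta) = \prod_{i=1}^{n_c} \left[ \prod_{r \neq s} \, \prod_{t \in T_i^{r,s}} \lambda_i^{r,s}(t) \right] \exp\left( - \int_{t_{\min}}^{t_{\max}} \sum_{s \neq r_i(u)} \lambda_i^{r_i(u),s}(u) \, du \right),$$

where λ_i^{r,s}(t) is the hazard of model (1) for child i moving from state r to state s at time t and r_i(u) is the state of child i just before time u.
The prior for each of the rate parameters (β, κ, μ) was taken to be Normal with mean 0 and standard deviation 100, restricted to positive values, independently of the other parameters. The prior for φ was assumed to be Exponential with scale parameter 1/ln(2), corresponding to equal a priori probabilities for this parameter to be less or more than one.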
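The statement that an Exponential prior with scale 1/ln(2) gives equal prior probability to φ < 1 and φ > 1 can be checked with a short calculation. The sketch below uses only the Exponential cumulative distribution function; the function and variable names are illustrative.

```python
import math


def exponential_cdf(x, scale):
    """P(X <= x) for an Exponential distribution with the given scale (mean)."""
    return 1.0 - math.exp(-x / scale)


phi_prior_scale = 1.0 / math.log(2.0)  # scale parameter stated in the text
prob_phi_below_one = exponential_cdf(1.0, phi_prior_scale)
print(f"prior mean of phi: {phi_prior_scale:.3f}")
print(f"P(phi < 1) a priori: {prob_phi_below_one:.3f}")
```

The prior median is therefore exactly one, while the prior mean (about 1.44) lies above it because the Exponential distribution is right-skewed.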
In our model, the hazard of clearing carriage depends on the time of acquisition. Therefore, assuming that those found as carriers at the beginning of the follow-up had just started carrying may bias the estimation of the clearance rate as the durations of these episodes of colonisation appear shorter than they should. To correct for such left censoring of initial episodes of carriage at t min , we included the unobserved length ℓ of the carrying time preceding time t min into the model. The preceding time ℓ was assigned an Exponential prior distribution with mean 60 days. Sampling of ℓ was done by proposing a new value from the prior distribution and then accepting or rejecting it according to the Metropolis-Hastings algorithm. Since the proposal distribution was taken to be the same as the prior distribution, the acceptance ratio depended only on the complete data likelihood ratio.
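The update of ℓ described above is an independence Metropolis-Hastings step in which the proposal equals the prior, so the prior and proposal densities cancel in the acceptance ratio. A minimal sketch follows, assuming a hypothetical callable `log_likelihood` that returns the complete data log-likelihood for a given value of ℓ; the toy likelihood used in the usage lines is made up purely to exercise the code and has no connection to the study data.

```python
import math
import random


def update_preceding_time(current_ell, log_likelihood,
                          prior_mean_days=60.0, rng=random):
    """One Metropolis-Hastings update of the unobserved carrying time.

    `log_likelihood` is a hypothetical callable returning the complete
    data log-likelihood with the preceding time set to its argument.
    """
    # Independence proposal: draw from the Exponential prior (mean 60 days)
    proposal = rng.expovariate(1.0 / prior_mean_days)
    # Prior and proposal densities cancel, leaving the likelihood ratio
    log_ratio = log_likelihood(proposal) - log_likelihood(current_ell)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return proposal if math.log(u) < log_ratio else current_ell


def toy_log_likelihood(ell):
    """Made-up stand-in that favours preceding times near 30 days."""
    return -0.5 * ((ell - 30.0) / 10.0) ** 2


rng = random.Random(2013)
ell = 60.0
draws = []
for _ in range(5000):
    ell = update_preceding_time(ell, toy_log_likelihood, rng=rng)
    draws.append(ell)
```

Proposing from the prior keeps the step simple, at the cost of a low acceptance rate whenever the likelihood is much more concentrated than the prior.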
The joint posterior distribution was defined as the product of the prior probabilities and the complete data likelihood (cf. [6]). Three separate MCMC chains of length 200,000 were realised for both datasets. The convergence was checked by inspection of the trace plots (Figure 1). After discarding 10,000 initial iterations from each chain, the posterior distribution was investigated from the combined 570,000 iterations, separately for the two datasets. Parameter estimates were given in terms of their posterior means and 90% credible intervals.
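The bookkeeping in this paragraph (discard the burn-in from each chain, pool the rest, report a posterior mean and a 90% credible interval) can be sketched as follows. The text does not state which type of credible interval was used, so the equal-tailed interval (5th to 95th percentile) below is an assumption, and the toy chains are placeholders rather than MCMC output.

```python
from statistics import fmean, quantiles


def summarise_posterior(chains, burn_in):
    """Pool several chains after burn-in and summarise one parameter."""
    kept = [draw for chain in chains for draw in chain[burn_in:]]
    cut_points = quantiles(kept, n=20)  # cut points at 5%, 10%, ..., 95%
    return {
        "n": len(kept),
        "mean": fmean(kept),
        "ci90": (cut_points[0], cut_points[-1]),  # equal-tailed (assumed)
    }


# Toy chains: 3 chains of length 200 with a burn-in of 10, mirroring the
# 3 x (200,000 - 10,000) = 570,000 retained iterations at 1/1000 scale
toy_chains = [[float(i) for i in range(200)] for _ in range(3)]
summary = summarise_posterior(toy_chains, burn_in=10)
```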

Crude estimates
For comparison, crude estimates of the model parameters were calculated. These estimates were derived from a simplified model of acquisition and clearance by assuming that children acquired or cleared pneumococcus at most once between any two consecutive visits one month apart and that the events could take place only at the end of each one-month time period. Crude estimates of rates were calculated by dividing the appropriate numbers of events (acquisition or clearance) by the respective person-time at risk for the event in question, as summarised in the Appendix (see also [13]).
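As a worked example, the crude estimates for the Portuguese dataset can be reproduced from the event counts and person-months quoted in the footnote of the Portuguese transition table. The sketch below only repeats that arithmetic; the variable names are illustrative, and the comments give our reading of what each count represents, which should be checked against the Appendix.

```python
def crude_estimates(n_serotypes=14):
    """Crude rates (per month) from the counts quoted for the Portuguese data."""
    # Community acquisition per serotype: 22 events over 1236 person-months
    kappa_hat = 22 / 1236
    # Within-room transmission: 39 events over 374 person-months, minus the
    # expected community contribution, divided by the exposure-weighted
    # denominator 60.18 (quoted as not directly readable from the table)
    beta_hat = (39 - kappa_hat * 374) / 60.18
    # Clearance: 55 events over 165 person-months of carriage
    mu_hat = 55 / 165
    # Competition: acquisition rate in carriers relative to non-carriers
    phi_hat = (24 / 1687) / (22 / 1236)
    return {
        "beta": beta_hat,
        "kappa": kappa_hat,
        "mu": mu_hat,
        "phi": phi_hat,
        "overall_community": n_serotypes * kappa_hat,
    }


estimates = crude_estimates()
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

Multiplying the per-serotype community rate by the 14 serotypes observed in Portugal recovers the overall crude community rate of about 0.25 per month.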

Model checking
To check the model fit, data were simulated from model (1) based on a subsample of size 30,000 from the posterior of parameters θ. For each of the rooms, the transmission process was simulated for the total number of attendees in the room, with the initial states sampled from the overall state distribution in the respective dataset. The process was simulated for one year, following which monthly samples were taken from all children, corresponding to the number of visits in that room in the dataset. The posterior predictive distributions of the prevalence and crude estimates of the transition rates were then compared to the actually observed prevalence and crude rates. For simplicity, the transitions relating to acquisition were

Results
Data summaries and exploratory analysis
Figure 2 shows the serotype distribution when the isolates were ranked from the most common to the least common, separately for each room. The distributions are clearly skewed, with the most common serotypes in the room accounting for the majority of the isolates in that room. Figure 3 shows the longitudinal patterns in the observed numbers of isolates per room. Tables 2 and 3 present the observed numbers of one-month transitions in the Portuguese and Finnish datasets, respectively. Based on these data, the crude transmission rates β̂ were 0.54 and 0.26 (per month) in the Portuguese and Finnish datasets, respectively. The overall crude rate of community acquisition (n_s κ̂) was 0.25 and 0.11 (per month) for Portugal and Finland, respectively. Thus, pneumococcal transmission appeared much stronger in the Portuguese dataset, both within the rooms and from the outside community. The Portuguese DCC attendees also appeared to clear colonisation more slowly than their Finnish counterparts, with a crude rate of clearing carriage μ̂ of 0.33 (per month) in Portugal and 0.54 (per month) in Finland. The crude estimates of the competition parameter φ̂ were 0.80 and 1.14 based on the Portuguese and Finnish datasets, respectively.
When data from rooms F1 and F2 were reanalysed with a more sensitive method of detecting pneumococcal carriage, the proportion of positive samples increased from 27% (57/215) to 29% (62/215). The proportion of multiple carriage, i.e. samples with more than one isolate, was 13% (8/62).
Table 4 presents the posterior estimates of the four model parameters (see also Figure 4). The main findings are similar to those based on the crude estimates. The posterior mean transmission rate was almost double in Portugal (1.05 per month) compared to Finland (0.63 per month) with non-overlapping 90% credible intervals (0.82-1.31 vs. 0.47-0.79). The overall posterior mean community rate of acquisition was double in Portugal as compared to Finland (0.25 vs. 0.12 per month) with non-overlapping 90% credible intervals (0.18-0.34 vs. 0.09-0.15). The posterior mean of the clearance rate was lower in Portugal (0.57 per month) than in Finland (0.73 per month), but the difference was smaller than between the crude estimates. In general, the higher rates of clearance and transmission found in the full model, as compared to the crude estimates, mean that the model allowed some events of acquisition and clearance to occur in between the visits even without direct observations in the samples. The posterior probability for the competition parameter (φ) to be less than one was 100% and 85% in the Portuguese and Finnish datasets, respectively. The posterior means were 0.51 and 0.67.

Model checking
Based on the posterior sample of the model parameters, the posterior predictive mean prevalence of pneumococcal carriage was 64% in Portugal (90% predictive interval 56%-72%) and 27% in Finland (90% predictive interval 19%-35%) (Figure 5A). These corresponded well to the observed values (61% in Portugal and 26% in Finland). The predictive crude transition rates were also in accordance with the rates calculated directly from the data (Figures 5B-D).

Discussion
In this article we explored what determines the prevalence of pneumococcal carriage in two European day-care settings. Parameters related to pneumococcal transmission were estimated from longitudinal studies conducted in Portugal and Finland. Differences in rates of transmission and clearance of carriage were identified as the main determinants of the observed differences in prevalence (61% in Portugal vs. 26% in Finland). Although the two studies differed in the ages of enrolled children (mean age of 2 years in Portugal vs. 4 years in Finland), this does not explain the whole extent of the prevalence difference since other studies have reported carriage levels higher than 60% among Portuguese children of age 4 years [17] and less than 40% in Finnish children of age 2 years [4]. In particular, pneumococcal transmission appears stronger in the Portuguese setting irrespective of age.
Both the rate of within-room transmission (1.05 vs. 0.63 per month) and the rate of community acquisition (0.26 vs. 0.12 per month) were roughly twice as high in Portugal as in Finland. Cultural differences could explain the higher within-room intensity of transmission in Portugal, for example a longer time spent indoors or at the DCC. In both settings, children from the same DCC spent time together in the playground. Since we considered each of the DCC rooms separately, transmission between rooms of the same DCC would appear as community acquisition. Thus, the higher community acquisition in Portugal was probably a consequence of the Portuguese study subjects attending a larger DCC than children in any of the Finnish DCCs in this study. The credible intervals for the within-room transmission and outside acquisition were wide due to the strong (posterior) correlation between these two parameters: high within-room transmission together with low community exposure was as likely as the reverse.
From the same dataset of Finnish DCCs, but complemented with samples from the children's families, Hoti et al. [6] estimated a slightly lower within-DCC transmission rate of 0.53 per month and a within-family transmission rate of 0.36 per month. A study of pneumococcal transmission in U.K. families, however, estimated a higher within-family transmission rate of 1.41 per month [18]. Three previous studies have reported only small differences in transmission rates across different serotypes [7,11,12]. The transmission rates within families varied between 0.64 and 0.84 per month (Bangladesh) [13] and within DCCs between 1.04 and 1.18 [12] and between 1.38 and 1.53 [11] (France). The two French studies may have over-estimated the transmission rates, since the first [12] assumed a fixed duration of carriage of 28 days, and the second [11] assumed that carriage needs to be cleared before colonisation by another serotype.
The duration of carriage (here defined as time until immune clearance, cf. [9]) was found to be longer in Portugal (estimated mean 47 days) than in Finland (37 days). The shorter duration in older children (mean age 4 years in Finland vs. 2 years in Portugal) is in agreement with previous analyses [9,19]. Serotype 19F [9,13] and serotype 6B [9] are typically found to be carried much longer. The fact that these serotypes were common in the Portuguese dataset but almost non-existent in the Finnish may have contributed to the estimated difference in the rates of clearance.
Footnotes to the table of one-month transitions (Portuguese dataset): The crude estimates of the model parameters are (cf. Appendix): the transmission rate β̂ = (39 − κ̂ × 374)/60.18 = 0.54 (per month); the community acquisition rate per serotype (in non-carrying children) κ̂ = 22/1236 = 0.018 (per month); the clearance rate of carriage μ̂ = 55/165 = 0.33 (per month); the competition parameter φ̂ = (24/1687)/(22/1236) = 0.80. For parameter β, the denominator (60.18) is not directly readable from the data as presented in the table. Only samples available from two consecutive visits were considered, accounting for 67% of the scheduled pairs of consecutive visits. The three rooms were analysed as six rooms, separated by the summer break.
1 Exposure is here defined as the presence of at least one carrier of the target serotype in the other attendees in the room.
2 Person-time here is the total time, in months, spent carrying pneumococcus overall (total row) or a specific serotype (serotype rows) stratified by the presence or absence of other carriers of the specific serotype.
3 Person-time here is the total time, in months, spent as a non-carrier, stratified by exposure to the specific serotype. For the total row, the time is a sum of serotype-specific person-times.
4 As in 2 but for children carrying pneumococcus.
Similar durations of carriage were found in other studies with children under 3 years old: 48 days in Danish DCCs [8] and 51 days in English households [18]. Serotype-specific durations of carriage in Bangladeshi infants up to 1 year of age were estimated between 43 and 48 days for all serotypes except for 19F, which was estimated to be carried on average 62 days [13]. Lipsitch et al. [9], who also estimated serotype-specific parameters, estimated durations of carriage between 28 and 123 days. The mean duration decreased with age, from 105 days in children less than 2 years old to 29 days in children between 3.5 and 5 years old.
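The estimated mean durations of 47 days (Portugal) and 37 days (Finland) are consistent with the posterior mean clearance rates of 0.57 and 0.73 per month under a Weibull model with shape ρ = 3. The sketch below assumes the survival function exp(−(μt)^ρ), for which the mean is Γ(1 + 1/ρ)/μ and the median is (ln 2)^(1/ρ)/μ, and a conversion of 30 days per month; both the parametrisation and the conversion factor are assumptions made for this illustration.

```python
import math


def mean_carriage_days(mu_per_month, rho=3.0, days_per_month=30.0):
    """Mean of a Weibull duration with survival exp(-(mu * t) ** rho)."""
    return math.gamma(1.0 + 1.0 / rho) / mu_per_month * days_per_month


def median_carriage_days(mu_per_month, rho=3.0, days_per_month=30.0):
    """Median of the same Weibull duration."""
    return math.log(2.0) ** (1.0 / rho) / mu_per_month * days_per_month


for setting, mu in [("Portugal", 0.57), ("Finland", 0.73)]:
    print(setting,
          round(mean_carriage_days(mu), 1),
          round(median_carriage_days(mu), 1))
```

With ρ = 3 the mean and median differ by less than one percent, which agrees with the remark in the Methods that the two are approximately the same for this choice of shape.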
Current carriage was found to reduce the subsequent acquisition rate by a factor of 0.5 in Portugal and 0.7 in Finland. The estimation of this competition parameter (φ), however, was sensitive to the choice of the prior distribution. In particular, the analysis required a small prior probability for values very close to zero. In the current analysis the competition parameter was included mainly to adjust for possible confounding effects on acquisition by concurrent carriage. More data on the frequency of multiple carriage and more frequent sampling would have been necessary to learn adequately about competition [20].
The overall prevalence of pneumococcal carriage in the Finnish dataset was low, even when a more sensitive method of detecting pneumococcal carriage was applied to samples from one of the day care centres. Moreover, the proportion of multiple carriage in these samples (13%) was comparable to that in another study (10%) using the same detection method in day care children [16], although the prevalence in our study was clearly lower (29% vs. 58%). This suggests that a low level of multiple carriage does not explain the lower prevalence and transmission rates in the Finnish data, although we did not have similar data from the Portuguese setting in this study. Moreover, it appears natural to assume that the dominant serotype is the one most likely to be transmitted as well as detected by sampling.
Although the prevalence of carriage and the total number of isolates were lower in Finland, the number of different serotypes was clearly larger in any of the Finnish DCCs than in the Portuguese DCC. The Portuguese children were on average younger than their Finnish counterparts (2 vs. 4 years), which could have affected the diversity of carriage: younger children carry the common paediatric types while older children and adults carry a wide variety of rarer serotypes. Cobey et al. [10] found an increase in the diversity of carriage with age. The Finnish DCCs were also more scattered in the outskirts of an urban area, such that there was probably no transmission between the DCCs and clear serotype clustering occurred, with rare serotypes being exceptionally prevalent in samples from individual DCCs [6]. The Portuguese DCC was more connected to the community and other DCCs and, as such, more mixing could occur within and across DCCs resulting in less clustering of serotypes.
Not all children in the studied DCCs were sampled. This may have caused some bias, especially in one of the Finnish DCCs in which only 26% of children were sampled. In the Portuguese DCC, only three rooms were included, corresponding to 22% of the children in the DCC. Although this ensured homogeneous ages among the study subjects, the sample may not be representative of the DCC as a whole. On the other hand, one month may be too long a sampling interval, especially for the Portuguese setting, in which the prevalence and the rates of transmission and community acquisition were higher. The optimal spacing between observations has been determined for a binary model, i.e. considering only carrier and non-carriers as possible states, and varied between 1 month for very low prevalence settings and 1 week in very high prevalence settings [21]. Nevertheless, the prevalence of carriage implied by the estimated parameters was close to the observed, and may be a good representation of the transmission dynamics. It would be interesting to extend the study reported here to a larger set of comparable longitudinal studies from different prevalence contexts or countries so as to gain a better understanding of pneumococcal transmission differences.
Footnotes to the table of one-month transitions:
1 Exposure is here defined as the presence of at least one carrier of the target serotype in the other attendees in the room.
2 Person-time here is the total time, in months, spent carrying pneumococcus overall (total row) or a specific serotype (serotype rows) stratified by the presence or absence of other carriers of the specific serotype.
3 Person-time here is the total time, in months, spent as a non-carrier, stratified by exposure to the specific serotype. For the total row, the time is a sum of serotype-specific person-times.
4 As in 2 but for children carrying pneumococcus.
5 Sampled only once in a child who was missing at the next visit.

Conclusion
We were able to estimate realistic parameters for pneumococcal transmission, which were comparable between the two settings and compatible with the observed prevalences (61% in Portuguese vs. 26% in Finnish day care children). The force of transmission in Portuguese children was found to be significantly higher than in the Finnish children. The community rate of acquisition, affected by the community structure, was also found to be higher in the Portuguese setting. The difference in carriage was explained by the higher rates of transmission and community acquisition as well as a lower rate of clearing carriage in younger DCC attendees in the Portuguese data.