The various geographical sets of units gave various insights into the spatial distribution of campylobacteriosis incidence. These differences can be attributed to a change in scale (e.g. for administrative units) or to a combination of changes in scale and zoning. Different criteria were used to compare the results obtained from different sets of geographical units, selected for their epidemiological relevance. First, the difference in estimated associations between campylobacteriosis incidence and exposure variables was explored, since different results can lead to a different understanding of the environmental factors influencing the spatial distribution of the disease. Next, residual clustering or hotspots of unexplained risk were explored. Local zones of unexplained risk were detected, which potentially identify different scales at which disease processes are occurring and specific areas and scales warranting further investigation.
Associations between campylobacteriosis and exposure variables
The number of significant variables in the CAR models was reduced as the level of aggregation increased. This was likely driven by a reduction of statistical power due to lower sample size rather than reduced strength of association as the point estimate in regression coefficients were in general larger in more aggregated data. It should be noted that, with the exception of covariates referring to animal density, significant effects on regression coefficients were only observed for those covariates showing less variability at lower level geographical units . Choice of unit based on minimal intra-zonal variance was suggested as a potential solution to the modifiable areal unit problem, as recently reviewed . The agriculture-based custom framework was created in order to increase the size of units to improve stability in rates, while preserving high intra-unit homogeneity in animal production. However, it has the drawback of being heterogeneous in unit size and many units had a non-compact shape, which might have hampered the ability to find statistically significant associations.
The direction of the associations was consistent and in agreement with current biological knowledge. Poultry and ruminants are frequently colonized with Campylobacter and can shed the bacteria in high numbers [10, 42–44]. Contamination in humans can then occur through direct contact with these animals or following indirect contact with a contaminated environment. We thus expected a positive association with a likely dose–response relationship, as we observed. For climate variables, the directionality of the association was also compatible with Campylobacter biology, which has reduced survival in a warmer environment and is sensitive to desiccation [45–49]. The inclusion of education as predictor variable was justified by a previous study reporting its influence on some steps needed for a case to be reported , with potential bias on spatial patterns. We observed a reduction in the incidence of campylobacteriosis with a lower level of education, which has also been reported by others and might represent a proxy for socio-economic status . Traveling and eating in a restaurant were reported as risk factors for campylobacteriosis in other countries, which are also associated with socio-economic variables including the level of education [51, 52]. To our knowledge, the importance of traveling as a risk factor for campylobacteriosis has not been evaluated in Canada, but is supported by a study conducted in a Canadian community, where about 22% of reported cases of campylobacteriosis were associated with international traveling . We could not exclude travel-related cases from our study because this information was missing for most cases. The biological pathway linking slaughterhouse presence to campylobacteriosis incidence is twofold: professional exposure of poultry slaughterhouse workers was previously reported to increase the risk of campylobacteriosis ; and slaughterhouse effluents were reported to harbor large quantities of Campylobacter , which could then contaminate the surrounding drinking or recreational water. However, we do not have a clear explanation for positive association between campylobacteriosis incidence and population density observed for some geographical units, apart from some effects due to differences in case reporting.
The slaughterhouse variable was associated with campylobacteriosis incidence, with a significant decrease in magnitude with aggregation and non-overlapping confidence intervals between geographical units. This met our expectation for this particular variable, since its effect would likely be limited to the nearby areas where workers live or environmental contamination occurs, and a reduction of the coefficient in the aggregation process would occur due to a dilution effect. We further explored these associations by multiplying the point estimate of large poultry slaughterhouse coefficients with the total populations at risk living in areas with large poultry slaughterhouses. In summary, the case load due to the presence of large poultry slaughterhouses in the study area ranged from 9 to 172 cases, depending on the framework. This illustrates the impact of choice of geographical units on potential public health conclusions.
For all other variables, the estimated coefficients of regression tend to increase in magnitude as the units become more aggregated, but this increase was not consistent across all geographical sets of units and thus not predictable. This increase is in accordance with previous work based on simple linear regression models, although another study conducted in a multivariate regression setting reported that the results were essentially unpredictable . For the animal density variables, the estimated regression coefficients with the highest values also had large standard errors, and were not statistically significant. Among the statistically significant ones, the variation in point estimates was too small in our perspective to affect epidemiological conclusions on the biological relevance of these variables. The only exception was perhaps for poultry density at the watershed level. At this level of aggregation, we suspected the presence of colinearity between poultry density and ruminant density based on our knowledge of the study area. Also, according to a simulation study, the aggregation process can introduce colinearity between exposure variables where none existed before . In order to explore this possibility, alternative models were built by excluding either the poultry density or the ruminant density from the model. For the watershed units, the exclusion of the poultry density variable led to larger and significant coefficients for the ruminant density variable (from 16.0 [p=0.22] to 23.1 [p=0.08] for ruminant density >20 per km2 and from 8.9 [p=0.12] to 15.9 [p<0.01] for ruminant density ≤250 per km2). This is suggestive of colinearity between these two variables, and so the effect attributed to poultry or ruminant density for frameworks at a larger scale is probably a cumulative effect of both types of production. Indeed, any epidemiological study aiming to study the distinct effect of poultry and ruminant density should avoid any use of geographical units at a scale larger than census consolidated subdivisions. From another perspective, the large regression coefficients for animal densities observed at the watershed scale could be indicative of real biological processes occurring at that specific scale, which is not impossible considering the importance of aquatic environments in the ecology of Campylobacter .
For the demographic and climatic variables, the point estimates of regression coefficients also increased with aggregation, although confidence intervals increased and point estimates were not statistically different between geographical units (i.e. point estimates for one geographical unit were included in the confidence intervals of estimates from other geographical units). It should be noted that high values of estimated regression coefficients for climate variables as seen for the “smallest” and “CLSC” geographical units compared to “municipality” are probably misleading. In fact, 51 and 39 of the divisions constituting these units were located on the islands of Montreal and Laval, respectively, urban areas characterized by warm temperatures and low amounts of precipitation. At the scale where it was measured, almost no variation in weather variables was seen within this area; indeed, these areas were probably over-represented in the estimated relationship between campylobacteriosis and weather variables.
Evaluation of the fit of models
The fit of models was explored in different ways. As previously reported, an increase of the correlation coefficient between observed and predicted data was seen as the level of aggregation increased. This is attributed to the smoothing effect of aggregation [25, 29] and highlights the fact that too much emphasis should not be placed on the interpretation of this measure. The absence of clustering in residuals was suggestive of a good adjustment for spatial dependence in all models. And finally, a small hotspot of unexplained residual risk was detected by the scan test for the three sets of geographical units at the smallest scales, i.e. the “smallest,” “municipality,” and “census consolidated subdivision” geographical units. Considering the small size of this hotspot, it was likely smoothed out in the aggregation process for geographical units at a large scale. A community episode of campylobacteriosis from a common source is a potential explanation for this high risk zone, or it could have been caused by a locally-acting factor not measured in our study. Molecular typing data on Campylobacter strains were not available from our dataset, but would have been very useful in distinguishing between these two processes, as we expect a higher similarity of strains isolated within a cluster in the presence of an outbreak from a common source . For two of the geographical units at a large scale (i.e. CLSC and census division), a large area of significant unexplained risk was detected. This area was characterized by high poultry production and relatively high predicted risk of campylobacteriosis. We do not have a clear explanation for this finding. It is possible that in areas of high animal density, the effect of environmental variables on campylobacteriosis risk is amplified in a non-linear way due to the dynamic of bacterial transmission between reservoirs. In more aggregated units, the areas included in the average animal density categories were less homogeneous, which might lead to poorer adjustment of the model. Geographically-weighted regression might be useful in exploring this possibility, but this was beyond the scope of our study.
The choice of framework and scales included in the study were limited to the ones at which cases could be located. Additionally, only cases reported through the surveillance system were included, limiting the generalization of results to all cases occurring in the population. In fact, case reporting involves multiple steps, which could be influenced by health care system, physician, or patient characteristics. In the province of Quebec, health care services are accessible to all, with universal health insurance plan allowing the entire population to receive free hospital and medical services, which should limit any related reporting bias. Regarding physicians, their propensity to ask for a stool sample have been reported to vary according to their speciality and severity of illness of their patients . Also, the decision of a diarrheic patient to consult a physician has been reported to depend upon severity of illness, urbanicity, and various socio-economic indicators of patients [50, 59]. Our models only offer partial adjustment related to patient characteristics, with the inclusion of an educational status variable. Thus, we cannot exclude that the observed association with environmental risk factors were biased by spatially-varying factors influencing steps conducting to the reporting of a case. However, as we used the same dataset for all geographical sets of units, the underreporting bias was kept constant and this should not invalidate our conclusions regarding the influence of geographical units on risk factor estimation. Also, our results could be dependent on the method used for statistical modeling. Using the Poisson regression model with spatially correlated random effects was explored for our study, but no convergence was obtained for most models. Our study design did not allow us to point out which mechanisms were exactly responsible for the differences observed. Theoretical studies on modifiable areal units assume that unit size does not vary much, but in our case we had small urban units with large rural units in most frameworks, and this could complicate interpretation.