Data sources
We conducted this study through extensive analysis of TB case notification data from the Kenya national TB control program database. The database is an elaborate and robust surveillance system that captures case notification data from the health facilities in every county and updates the records on the national grid. For the process of data capture into the surveillance system, the National TB program adapted both the recording and reporting tools from WHO. The WHO recommends systematic screening for HIV among TB patients; our dataset captures the HIV status of all the TB case notifications. We analyze the data aggregated at the county level.
Model description
For the county s in the year t, we modeled the TB-HIV cases notification yst as
$$ {\mathrm{y}}_{\mathrm{st}}\sim \mathrm{Poisson}\left({\uplambda}_{\mathrm{st}}\right) $$
We assumed our count data follows the Poisson distribution where the log of the relative risks was the focus of modeling. We defined the mean λst in terms of the unknown relative risk and expected number of co-infection cases. That is, λst = ρstEst.
We defined the population at risk of TB-HIV co-infection are the TB cases. We computed the expected counts of co-infection cases Est per county per year. These counts represent the number of cases one would expect if the population of county s has similar behavior to the standard population. Our statistical consideration for the standard population N was the average of the pooled TB cases, that is, \( \mathrm{N}=\frac{\mathrm{P}}{\mathrm{Y}} \), where P is the total number of TB cases at risk of co-infection and Y is the number of years, which is seven for this study. We calculated the crude rate as \( {\mathrm{R}}_{\mathrm{st}}=\frac{\sum {\mathrm{X}}_{\mathrm{st}}}{{\mathrm{P}}_{\mathrm{st}}} \) where ∑Xst and Pst are the number of co-infection cases and the number of TB cases in county s, year t respectively. We then multiplied the crude rate by the standard population to obtain the expected number of co-infection cases
$$ {\mathrm{E}}_{\mathrm{st}}={\mathrm{R}}_{\mathrm{st}}\times \mathrm{N} $$
We expressed the linear predictor on the logarithmic scale, ηst = log(ρst) which is the recommended invertible link function for the Poisson family of distributions. We compared the spatiotemporal disease models discussed by [23]. The models differed in their formulation of the space-time structure and the inclusion or not of the covariates. Model 1a applied the classical parametric formulation of [24] on the linear predictor, which we expressed as
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+\left(\uprho +{\updelta}_{\mathrm{s}}\right)\times {\mathrm{Z}}_{\mathrm{t}} $$
(1a)
The formulation included the spatially structured (υs) and unstructured (νs) random effects, the global linear time trend effect (ρ × Zt). The term δs × Zt is the interaction term between space and time defining the difference between ρ and the area-specific time trend. It is referred to as the differential trend of the sth area [23, 24]. The term Zt is a vector of temporal weights and the intercept α quantifies the average co-infection rate in all the 47 counties. Each spatial unit has its own time trend with a spatial intercept (α + υs + νs) and a slope (ρ + δs). This model assumes a linear time trend in each spatial unit. We estimated the parameters θ = {α, ρ, ν, υ, δ} and the hyper-parameters ψ = {τν, τυ, τδ}.
The model 1b included the covariates to the model 1a thereby estimating θ = {α, β, ρ, ν, υ, δ} and ψ = {τν, τυ, τδ}. The model expression was
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +\sum {\upbeta}_{\mathrm{i}}{\mathrm{x}}_{\mathrm{i}}+{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+\left(\uprho +{\updelta}_{\mathrm{s}}\right)\times {\mathrm{Z}}_{\mathrm{t}} $$
(1b)
The model 2a used the dynamic non-parametric formulation on the linear predictor
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+{\upgamma}_{\mathrm{t}}+{\upphi}_{\mathrm{t}} $$
(2a)
The terms α, υs and νs are similar to the formulation in the first model, additionally, the terms γt and ϕt represents the temporally structured and unstructured random effect respectively. The model assumes a non-parametric time trend. In this formulation, θ = {α, ν, υ, γ, ϕ} and ψ = {τν, τυ, τγ, τϕ}.
The model 2b incorporated the covariates to the model 2a to estimate θ = {α, β, ν, υ, γ, ϕ} and ψ = {τν, τυ, τγ, τϕ}. We expressed model 2b as
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +\sum {\upbeta}_{\mathrm{i}}{\mathrm{x}}_{\mathrm{i}}+{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+{\upgamma}_{\mathrm{t}}+{\upphi}_{\mathrm{t}} $$
(2b)
Our model 3a expanded the model 2a by allowing a space-time interaction to explain for the difference in the time trend of TB-HIV coinfection for the diverse counties.
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+{\upgamma}_{\mathrm{t}}+{\upphi}_{\mathrm{t}}+{\updelta}_{\mathrm{s}\mathrm{t}} $$
(3a)
For this model, θ = {α, ν, υ, γ, ϕ, δ} and ψ = {τν, τυ, τγ, τϕ, τδ}. We defined δst as the interaction between νs and ϕt consequently assuming no interaction between υs and γt therefore δst~N(0, τδ).
The final model 3b incorporated the covariates to the model 3a to estimate θ = {α, β, ν, υ, γ, ϕ, δ} and ψ = {τν, τυ, τγ, τϕ, τδ}. We formulated the model as
$$ {\upeta}_{\mathrm{s}\mathrm{t}}=\upalpha +\sum {\upbeta}_{\mathrm{i}}{\mathrm{x}}_{\mathrm{i}}+{\upupsilon}_{\mathrm{s}}+{\upnu}_{\mathrm{s}}+{\upgamma}_{\mathrm{t}}+{\upphi}_{\mathrm{t}}+{\updelta}_{\mathrm{s}\mathrm{t}} $$
(3b)
To assess the performance of these six models, we used the DIC taking into consideration the complexity of the models. We selected the model with the lowest DIC as the best-fit model.
Baseline predictor variables
The set of baseline predictors were poverty index, infrastructure index, health index, education index, gender ratio, dependency ratio, and Gini coefficient. These predictors are standard indices used to establish the comparative level of development of different counties in Kenya. The computation of these indices is further elaborated in the reports from [25] and [26]. All these predictor variables were fitted in the models 1b, 2b and 3b but only the significant ones were considered in the discussion.
The poverty index provides a measure of the inadequate consumption of is services and fundamental rights. In other words, it estimates the disparities in resource expenditures for each county. The infrastructure index captures access to natural resources, economic growth, and innovative planning. The health index measures access to medical services, adequate medical workforce, and improved medical productivity. The education index captures the literacy attainment, completion and dropout rate. The gender inequality index reflects the bias in reproductive health, empowerment and labor market between men and women. The dependency ratio gives an indication of the burden of the working population and government to support the non-working population who are either too young or too old. The Gini coefficient compares the distribution of income in the entire population of any given county. It is based on the Lorenz curve and varies between 0 (complete equality) and 1 (complete inequality).
Statistical analysis
For the demographic characterization of the case notifications, we compared the summaries of TB cases with and without HIV infections. We stratified the data based on HIV status and performed the chi-square test to determine the association between HIV status and each of the demographic variables TB-type, age, gender, and patient type. All the p-values were two-tailed with values less than 0.05 considered being statistically significant. The TB type classification was either pulmonary TB or extra-pulmonary TB. Pulmonary TB referring to a patient with TB disease involving the lung parenchyma whereas the extra-pulmonary TB involves any organ other than the lungs. For the patient type, we had five categories; the first was the default category for patients who defaulted the TB therapy then experienced recurrence. The second was the failed category for patients previously diagnosed with TB but never took on the therapy. The third category was for newly diagnosed patients without previous TB diagnosis or therapy. The relapse case was the fourth category whereby patients were previously diagnosed, treated of TB and completed the TB therapy but experienced a recurrence. The fifth and final category was the cases transferred in from other health facilities to continue with the therapy.
Using the Integrated Nested Laplace Approach (INLA), we fitted the case notification data to our spatiotemporal disease models to determine the best fit. We assessed the nature of the response variables on our baseline predictors. We specified the Besag-York-Mollie (BYM) prior on υs using the intrinsic conditional autoregressive structure (iCAR). Thus \( {\upupsilon}_{{\mathrm{s}}_{\mathrm{i}}}\mid {\upupsilon}_{{\mathrm{s}}_{\mathrm{i}}\ne {\mathrm{s}}_{\mathrm{j}}}\sim \mathrm{N}\left(\frac{\sum_{\mathrm{j}\upepsilon \mathrm{N}\left(\mathrm{s}\right)}{\upupsilon}_{{\mathrm{s}}_{\mathrm{j}}}}{\#\mathrm{N}\left(\mathrm{s}\right)},\frac{\upsigma_{\upupsilon}^2}{\#\mathrm{N}\left(\mathrm{s}\right)}\right) \) where #N(s) is the number of neighbors sharing boundaries with the county si. The BYM model allows us to capture both the heterogeneity (variability) and clustering of disease risk simultaneously. We then used the exchangeable prior on νs, that is \( {\upnu}_{\mathrm{s}}\sim \mathrm{N}\left(0,{\upsigma}_{\upnu}^2\right) \). We modeled γt using a random walk specified through the temporal adjacency structure, which is analogous to the spatially structured random effects specification as it borrows strength from adjacent time periods. The temporally unstructured random effect ϕt was modeled using the Gaussian exchangeable prior ϕt~N(0, τϕ). We defined improper priors for the intercept and regression coefficients of the fixed effects as α~N(0, 0) and β~N(0, 0.001) respectively. For the distribution of the hyper-parameters, we assumed the default specifications of INLA whereby we assigned minimally informative priors on the log of the precision of both the structured and unstructured effects ψ~(1,0.0005).