The generic disease branch of the individual-based disease modeling software EMOD DTK v1.6 was used to model polio transmission. The model was developed in an iterative fashion, beginning with a generic, individual-based SEIR model, with age structure, seasonal forcing, and division into spatial metapopulations progressively added to the model to improve fits to the data. Only children under 5 years of age are included in the model, as these children account for ~98 % of total WPV1 paralytic cases in the dataset for northern Nigeria. The 2006 Nigerian census reports that 1.8 million children under 5 years old lived in Kano state at that time [9]. The simulated birth rate maintains a population growth rate of 2.8 % [10]. Our simulated population consists of approximately 800 k agents in 2006 (averaged over simulation instances, as vital dynamics are stochastic in this model), so that each simulated individual represents approximately 2.2 real individuals. Simulation with 1.8 million agents resulted in unacceptably long instance runtimes. Reducing the simulated population prevented this issue from occurring, and as the model implements frequency-dependent transmission, the mean dynamics should not directly depend on the number of simulated agents, as long as that number is well above the critical community size.

Upon clearance of infection or a successful immunization with oral polio vaccine (OPV), individuals are modeled as being completely immune to future infection. Challenge studies have shown that individuals are in fact susceptible to reinfection. However, the probability of reinfection is strongly reduced after a successful infection or vaccination; the metastudy presented in [11] estimates that the relative odds of shedding serotype 1 poliovirus post-challenge among OPV-vaccinated individuals compared with unvaccinated individuals is 0.13. [12] presents an odds ratio of 0.15, with 80 % of unvaccinated individuals excreting virus post-challenge, compared with 37 % of OPV-vaccinated individuals. The OPV-vaccinated individuals in [12] were also observed to shed for a mean of 4.6 days, as opposed to 20.4 days among the unvaccinated, and the mean titer of virus excreted by OPV-vaccinated individuals was over 3 orders of magnitude lower than unvaccinated controls; naturally immune individuals exhibited similar characteristics to the OPV-vaccinated group. The reduced chance of infection, shorter shedding period, and reduced shedding titer of OPV- and WPV-experienced individuals combine to provide support to the simplifying assumption that the observed disease transmission dynamics are driven primarily by naïve individuals, rather than by reinfection of previously individuals.

The historical SIA campaign calendar for northern Nigeria (see Fig. 8) is implemented in simulation. In calibration of this model, the effects of SIA campaigns are parametrized by a campaign coverage multiplied by assumed vaccine take rates for different vaccine types. The product of the two represents the fraction of susceptible children immunized in a single SIA round and is the relevant quantity being calibrated; this product is constrained to lie between 0 and 1. Model calibration figures will present the campaign coverage alone, as the vaccine take rates are assumed to be constant. Trivalent OPV campaigns are modeled with a 15 % probability of successful vaccination against type 1, while monovalent OPV1 and bivalent OPV are modeled with a 20 % success probability each; these numbers are taken from 30-day seroconversion rates to a birth dose in India [13]. The efficacy of the multiple types of OPV has been measured in numerous studies [11, 13–22], and the results vary considerably across studies. The numbers employed here are on the lower end of observed single-dose take rates. Circulation of attenuated vaccine-derived virus is not modeled. Each simulation begins on Jan 1, 1985, with a 5-year period in which infected individuals are stochastically imported into the network. The mean importation rate is one infected individual per week per 10 k population for these 5 years, after which the disease is sufficiently established to survive without importation in the absence of vaccination. The SIA campaign calendar was provided by the Nigerian WHO and [23–26].

### Model development and calibration

The acute flaccid paralysis (AFP) surveillance database, maintained by the Nigerian WHO, provides the basis for model calibration; the model was calibrated only to data from 2003–2009. Only AFP cases with WPV1 shedding confirmed by the Global Polio Laboratory Network [27] are included. Incremental mixture importance sampling (IMIS) [28] was used for parameter space exploration and calibration. Though IMIS applies to deterministic models and true likelihood functions, the authors find it to work quite well at calibrating stochastic models given an objective function, so long as parameter-based variation in the objective function outweighs the stochastic variance in the objective function at a particular parameter set, as is the case for most of the calibrations presented below. The value of the objective functions evaluated at a particular parameter set and stochastic instance will be referred to as the score. The objective functions themselves are described in detail in Additional file 1. The data targets are the time series of case counts, binned twice per month; the distribution of age at onset of paralysis, binned into 6-month age bins; and the spatial distribution of cases at the LGA level.

The model was developed in an iterative fashion, beginning with a generic, individual-based SEIR model, and adding age structure, seasonal forcing, and division into spatial metapopulations progressively to improve fits and target additional features of the data. The model development and calibration section is split into subsections describing the reasoning and calibration at each step of development, beginning with a very simple (and poor) model and progressing through a couple of intermediate models before arriving at the finalized model. The reader may wish only to see the specification and calibration of the finalized model and results, contained in Additional file 1– Model Specification, Model Development – Spatial Model, and Results, respectively.

### Model development – Basic SEIR model

The incubation and infectious periods, base reproductive number, and the campaign coverage of a basic SEIR model were fit to these data. Campaign coverage is assumed to be constant over this time period, and campaigns are modeled as being randomly accessed by individuals, with no structured heterogeneity in individuals’ probabilities of being vaccinated in a single campaign. The infectious and incubation periods are of fixed duration, rather than exponentially distributed. The choice of a fixed duration maintains the benefit of having only a single parameter to calibrate for each, like the exponential distribution, while removing that distribution’s probability of individuals drawing unrealistically short infectious durations.

The case-to-infection ratio of WPV1 in immunologically naïve individuals is approximately 1:200 [29]. For the purposes of calibration, the modeled incidence over time is fit to the case count data using a Poisson approximation to a binomial objective function, which is maximized over an unconstrained case-to-infection ratio (for an investigation of overdispersion in the case counts relative to the model, and how this affects the results, please see Additional file 1, Overdispersion section). This construction provides a scale-independent fit of the shape of the modeled incidence curves to the case counts over time. In the high-incidence periods used for calibration and with a large simulated population, the average dynamics are quite insensitive to the total modeled population. More details about the objective functions employed are included in Additional file 1.

Figure 1 presents the results of this calibration. Figure 1(a) and (b) show that the objective score is strongly dependent on R_{0} and SIA coverage, but relatively weakly dependent on the incubation and infectious periods; the space around 4 days exposed and 5 days infectious is preferred, but individual samples of comparably high score are present throughout the 2D space. Figure 1(c) and (d) present the target data as blue bars. The red contours indicate the 68 %/95 %/99 % quantiles of distributions constructed by weighting the outputs according to their score (were the objective functions true likelihoods, these would represent quantiles of the posterior distribution; as they are not, the statistical interpretation of these contours is not immediately obvious, but they serve to demonstrate how well the calibrated model reproduces the targeted features of the data). These figures demonstrate that the capability of this model to fit the data is rather poor; the age distribution of infection (as expected) is monotonically decreasing, in contrast to the data which peaks in the 1.5–2 year age bin. The simulated case counts over time do exhibit a low-amplitude oscillation with the 2-year periodicity exhibited by the data, but fail to reproduce the high amplitude of the outbreaks and long troughs of very low case counts in the data.

### Model development – Age dependent exposure and seasonality

The poor performance of a basic SEIR model is not unexpected. In a homogenously mixed SEIR model, new infections will be distributed uniformly among the susceptible population, the proportion of susceptibles will decline with age as older age groups have been exposed for longer, and the distribution of new infections will thus decline with age as well, as seen in the simulation results of Fig. 1(a). Similarly, oscillatory behavior in a homogenously-mixed SEIR models is expected to damp toward an equilibrium value; stochastic resonance can drive oscillations at the natural frequency of the system, but for the large population simulated here these oscillations are insufficiently large to explain the data. This model is presented primarily for comparison against more developed models, to demonstrate the value of additional data-driven complexity.

A number of mechanisms could potentially explain the discrepancy in the simulated age at infection distribution and the observed age at onset of paralysis. Age-dependence in the case-to-infection ratio has been reported in the past [30]; however, the measurements indicate no more than a factor of 2 difference over the first 5 years of life, insufficient to transform the simulated distribution into the data. Maternal antibodies are another obvious potential explanation, but previous work indicates that these wane with a half-life on the order of 3–4 weeks [31–35], much too rapidly to solely explain a rising attack rate over the first 2 years of life. The hypothesis implemented into the model is reduced mixing by young children, such that the force of infection is reduced by an age-dependent factor. For simplicity’s sake, this age-dependent scaling factor is assumed to be multiplicative, linearly dependent on age, and bounded between 0 and 1 (where 0 represents no exposure to infectivity, and 1 represents full exposure. To clarify, let *β*(*a*, *t*) represent the force of infection from all infectives onto susceptibles of age *a*, then:

$$ \beta \left(a,\ t\right) = \left\{\begin{array}{ll}\kern2.5em 0,\hfill & {f}_0+{f}_1a\le 0\hfill \\ {}{\beta}_0\left({f}_0+{f}_1a\right),\hfill & 0 < {f}_0+{f}_1a<1\hfill \\ {}\kern2.25em {\beta}_0,\hfill & {f}_0+{f}_1a\ge 1\hfill \end{array}\right. $$

The model is recalibrated, with the slope and age-0 intercept of this scaling factor included as additional dimensions in the calibration.

Figure 2 shows the results of this calibration. The simulated age at infection distribution depends most strongly on the base reproductive number and the slope and intercept of the age-dependent mixing coefficient. Figure 2 only presents the score in the 2D space of R_{0} and the slope of the age-dependent mixing; the most likely points for the intercept are clustered near 0, as expected due to the very low case counts in the 0–6 month age bin. It should be noted that this reduced mixing induces a shift in the true R_{0} that depends on the age-dependent mixing factor and the age distribution of the population; for the remainder of this paper, R_{0} (or base reproductive number in figure axes) will refer to the daily force of infection times the infectious period, which represents the R_{0} within a fully mixed population.

For subsequent calibrations and model runs, the intercept of age-dependent mixing will be fixed at 0 (see Fig. 2 panel b), and the relationship between the slope and the base reproductive number will used to collapse these two onto a single dimension; that is, R_{0} will remain a freely fitted parameter, but at a given R_{0}, the age-dependent mixing slope is determined by the fitted curve shown in Fig. 2a.

The next model iteration aims to address the strong periodicity and high amplitude of WPV1 outbreaks, by adding a seasonal forcing term to the infectivity:

$$ \beta \to \beta \left(1+A\left( \cos \left(\frac{t-{t}_0}{365}\right)\right)\right) $$

Seasonal outbreaks have been observed in many infectious diseases [36–43], and seasonally-forced models have been shown to exhibit bifurcations from annual outbreaks to dynamics with longer periodicities [44, 45]. The 2-year periodicity of WPV1 outbreaks in Kano state constrains the space of combinations of base reproductive number, SIA campaign coverage, and seasonal forcing amplitude.

Figure 3 shows the results of this calibration procedure. The fit to the case count data is substantially improved by the addition of seasonal forcing to the model, and the fit to the age distribution remains good. The overall peak score has increased dramatically; compare the color bar limits between Fig. 3 and Fig. 1. The structure of the objective function presents some interesting features, with multiple broad regions of high score. The incubation period and infectious period are not strongly constrained by the fits, but existing studies indicate a very short delay from exposure to the onset of viral excretion [46, 47]. When the calibration space is restricted to include only incubation periods between 1 and 5 days, the score-weighted mean infectious period is 27 days, a value which is also within the range supported by shedding studies [12, 46, 48]. Inclusion of these features appears to have improved the fit of these biological parameters relative to the SEIR model without these additional features, which inferred a mean infectious period of only a few days, Fig. 1. The seasonal forcing model is best fit with an amplitude of 0.11 and a peak on April 10. These values will be fixed in future calibration runs, while R_{0} and the mean SIA campaign coverage remain as calibration parameters.

As noted early in this section, the time series of cases from data and infections from simulation are compared using a scale-independent fit, with the case-to-infection ratio representing a free scaling parameter in the fit. While this simple SEIR model with minimal added complexity is able to reproduce the outbreak dynamics, the weighted mean case-to-infection ratio is about 1:1650, much lower than the expected 1:200 ratio for WPV1. Assuming the expected 1:200 ratio, the model overpredicts the observed number of cases by a factor of approximately 8 on average. This discrepancy may indicate the existence of some unmodeled complexities. Multiple hypotheses could all plausibly contribute to this discrepancy: The assumption of complete immunity after infection with OPV or WPV rather than the biologically justifiable partial immunity and waning immunity could allow for some degree of transmission to proceed through individuals who are less likely to present paralysis (though for reasons given previously, the authors expect this contribution to transmission to be small), individual-level heterogeneity in transmission (through host factors or disparate social networks), and the treatment of Kano as a single mixing population, rather than accounting for the spatial distribution of the population and how this affects transmission. The model development proceeds by investigating this last hypothesis, though all are likely to contribute.

### Model development – Spatial model

The polio case dataset used for calibration reports cases at the scale of local government areas (LGAs), providing spatial information that may also be incorporated into model calibration, which requires splitting the single-population model presented above into a set of metapopulations. Construction of the spatial metapopulation model is covered in detail in Additional file 1. WorldPop [49, 50] population maps are the basis for metapopulation construction. Spatially contiguous features of high population density are identified and aggregated into population centers, with the total population placed at the population-weighted mean latitude and longitude. Any remaining low-density background population is aggregated to the nearest population center. Nodes with total populations below 1000 individuals are merged with their nearest neighbor. This process results in 1322 nodes. Each of these population centers is a separate metapopulation, or node, in the model.

Nodes are internally well-mixed (with the age-dependent mixing described above), and coupled to each other through migration of individuals. Individuals are assumed to take only round-trip migration, with an average duration of 1 day at the destination node. Individual migration rates are assumed to follow a gravity-like model, with the rate linearly increasing with destination population, inversely linear with the distance to the destination node, and independent of the population of the home node.

$$ {R}_{ij}=A*\frac{p_j}{d_{ij}} $$

The migration rate is implemented in units of per-individual per day, so that the total flow from node *i* to node *j* is implicitly linear in the population of node *i.* The network is not completely connected; migration from a node only occurs to its 8 nearest nodes and its 30 remaining most probable destinations.

Calibration of the spatial model covers a 5-dimensional parameter space including the coverages of routine immunization (RI) and campaign immunization, the base reproductive number, a coefficient of variation in campaign coverage (though not RI) at the LGA level, and the scaling constant of the overall rate of migration.

The spatial parameters added to the model necessitate the addition of components to the objective function that are sensitive to spatial effects. The new target data include the number of LGAs reporting at least one case within a 3 month time window, and the annual case counts in each of the 44 LGAs of Kano state. The objective functions do not aim to match specific LGAs between simulation and data, but target the general heterogeneity in LGA-level case counts.

Results of the calibration are presented in Fig. 4 (except for the age of infection distribution, suppressed as the fit is of similar quality to those presented previously). Panels a-c present the log-score distributions in 2D planes similar to those shown previously. The best-fit campaign and RI coverages both increase with infectivity, as expected. The LGA-level coefficient of variation of the campaign coverage is not strongly constrained, though the scaling factor on the migration rate is constrained to lie within approximately 1 order of magnitude. To provide an interpretation for this result, Fig. 5 presents a histogram of the unscaled migration rates by node. Scaling this histogram by the high-score region in Fig. 4(c) reveals that the best fits occur when individuals in most nodes have outbound travel frequencies from about once per 3 days to once per 30 days.

The inferred case-to-infection ratio from the metapopulation model was approximately 1:1200. While still much higher than the expected 1:200, this does represent an improvement in the overall scale of the simulated outbreaks when compared with the single-population model. Other unmodeled heterogeneities may be responsible for the continuing discrepancy; perhaps the modeled population may be viewed as an undervaccinated subpopulation or a subpopulation at high risk relative to the total population.

After calibration of the spatial model, the model is applied to the question of elimination. At the time of writing, no cases of WPV1 paralysis have been observed in Nigeria since July 24, 2014. The WHO standard for certifying interruption of WPV in a region is 3 years without an observed paralysis case. Other models have been applied to the question of elimination probability given a case-free period [5–8, 51], and this model is used to address the same question.

The calibration parameter space from the 2003–2009 calibration is resampled for each simulation, with the priors constricted to reflect the regions of high score in Fig. 4. At this stage, the fitting timeframe is extended to include all data until the most recently observed WPV1 case, and the simulation timeframe is extended to 2020. Importation pressure from other states in northern Nigeria is added to the model because Kano state underwent a period of 20 months in 2009–2010 without a single WPV1-positive paralysis case. This is a low-probability event in this model and indicates the possibility of a local elimination followed by reimportation.

Increasing campaign coverage over time is also added to the model for this investigation, as the available data on campaign coverage in Kano indicates that campaign coverage improved in the late 2000s and into the 2010s (lot quality assurance sampling (LQAS) data, provided by the Nigerian WHO). In this exploration, campaign coverage is assumed to increase linearly with time, beginning at some time between Jan 1, 2005 and Jan 1, 2010. After the last campaign in the historical and planned campaign calendar (Dec. 2015), campaigns are assumed to use bOPV or mOPV1, with campaigns occurring at a constant rate of 9/year.

Finally, the scale-free fitting utilized in calibration is not appropriate for addressing the question of pathogen elimination, as the size of the mixing population and the case-to-infection ratio affect how long infection can silently transmit. Therefore, the simulated population is reduced to match the scale of the simulated infections with the observed paralysis data at a case-to-infection ratio of 1:200. The final simulated population is ~284000 in Mar 2006, compared to the census population of 0–5 year olds of 1.8 million in the same year. Figure 7 demonstrates that the quality of the calibration fits remains comparable to the larger population simulations.