### Calculation of the global invasion threshold

The reproductive number *R*
_{0} provides a threshold condition for a local outbreak in the community; if *R*
_{0} > 1 the epidemic will occur and will affect a finite fraction of the local population, otherwise the disease will die out [23]. The condition for global invasion is however made more complicated by the interplay between the local transmission dynamics and the mobility process that is responsible to seed non-infected subpopulations. Even in the occurrence of a local outbreak, given *R*
_{0} > 1, the epidemic may indeed fail to spread spatially if the mobility rate is not large enough to ensure the travel of infected individuals to other subpopulations before the end of the local outbreak, or if the amount of seeding cases is not large enough to ensure the start of an outbreak in the reached subpopulation due to local extinction events. All these processes have a clear stochastic nature and they are captured by the definition of an additional predictor of the disease dynamics, *R*
_{*} > 1, regulating the number of subpopulations that become infected from a single initially infected subpopulation [28, 43–46], analogously to the reproductive number *R*
_{0} at the individual level. An expression for *R*
_{*} has been found in the case of metapopulation epidemic models with different types of mobility processes, including homogeneous, traffic-driven, and population-driven diffusion rates [28, 46], commuting-like processes [47, 48] and origin–destination processes with adaptive behavior [49] or heterogeneous length of stay at destination [50]. In all those cases the population is assumed to mix homogeneously and to travel according to rates that are uniform across individuals. Here we go beyond those assumptions and examine the relationship between the occurrence of a global outbreak and the age-dependent transmission dynamics coupled with the age-specific travel behavior, through the calculation of the global invasion threshold *R*
_{*}.

Let us consider the invasion process of the epidemic spread at the metapopulation level, by using the subpopulations as our elements of the description of the system. We assume that the outbreak starts in a single initially infected subpopulation of a given degree *k* and describe the spread from one subpopulation to the neighbor subpopulations through a branching process approximation [51]. We denote by {D}_{k}^{n} the number of infected subpopulations of degree *k* at generation *n*, with {D}_{k}^{0} being the initially seeded subpopulation, {D}_{k}^{1} the subpopulations of degree *k* of generation 1 directly infected by {D}_{k}^{0} through the mobility process, and so on. By iterating the seeding events, it is possible to describe the evolution of the number {D}_{k}^{n} of infected subpopulations as follows:

\begin{array}{l}{D}_{k}^{n}={\displaystyle \sum}_{k\text{'}}{D}_{{k}^{\text{'}}}^{n-1}\left({k}^{\text{'}}-1\right)P\left(k|{k}^{\text{'}}\right)\phantom{\rule{1pt}{0ex}}\left(1-{\displaystyle \sum}_{m=0}^{n-1}\frac{{D}_{k}^{m}}{{V}_{k}}\right)\xb7\\ \phantom{\rule{8.3em}{0ex}}{\mathrm{\Omega}}_{k\text{'}k}\left({\lambda}_{{k}^{\text{'}}k,c},{\lambda}_{{k}^{\text{'}}k,a}\right)\end{array}

(5)

The r.h.s. of equation (5) describes the contribution of the subpopulations {D}_{k\text{'}}^{n-1} of degree *k* ' at generation *n* − 1 to the infection of subpopulations of degree *k* at generation *n*. Each of the {D}_{k\text{'}}^{n-1} subpopulations has *k*
^{'} − 1 possible connections along which the infection can spread. The infection from {D}_{k\text{'}}^{n-1} to {D}_{k}^{n} occurs if: (i) the connections departing from nodes with degree *k*' point to subpopulations with degree *k*, as ensured by the conditional probability *P*(*k*|*k*
^{'}); (ii) the reached subpopulations are not yet infected, as indicated by the probability \left(1-{{\displaystyle \sum}}_{m=0}^{n-1}\frac{{D}_{k}^{m}}{{V}_{k}}\right), where *V*
_{
k
} is the number of subpopulations with degree *k*; (iii) the outbreak seeded by {\lambda}_{{k}^{\text{'}}k,c} and {\lambda}_{{k}^{\text{'}}k,a} infectious individuals, children and adults, respectively, traveling from subpopulation *k’* to subpopulation *k* takes place, and the probability of such event is given by {\mathrm{\Omega}}_{k\text{'}k}\left({\lambda}_{{k}^{\text{'}}k,c},{\lambda}_{{k}^{\text{'}}k,a}\right). The latter term is the one that relates the dynamics of the local infection at the individual level to the coarse-grained view that describes the disease invasion at the metapopulation level. It also provides the contribution of children and adults age classes, thus including the effects of non-homogeneous travel behaviors and mixing patterns. The numbers of infectious individuals of each type flying from a subpopulation with degree *k*' and arriving to a subpopulation with degree *k* during the entire duration of the outbreak are given by:

\begin{array}{l}{\lambda}_{{k}^{\text{'}}k,c}={d}_{{k}^{\text{'}}k,c}\xb7\frac{{\mathit{z}}_{\mathit{c}\phantom{\rule{0.25em}{0ex}}}{\mathit{N}}_{{\mathit{k}}^{\text{'}},\mathit{c}}}{\mu}=\frac{\mathit{r}{\mathit{d}}_{\mathit{k}{\mathit{k}}^{\text{'}}}}{\alpha}\xb7\frac{{\mathit{z}}_{\mathit{c}\phantom{\rule{0.25em}{0ex}}}\alpha {\mathit{N}}_{{\mathit{k}}^{\text{'}}}}{\mu}\\ \phantom{\rule{3.4em}{0ex}}=\mathit{r}{\mathit{d}}_{\mathit{k}{\mathit{k}}^{\text{'}}}\frac{{\mathit{z}}_{\mathit{c}\phantom{\rule{0.25em}{0ex}}}{\mathit{N}}_{{\mathit{k}}^{\text{'}}}}{\mu}\end{array}

(6)

{\lambda}_{{k}^{\text{'}}k,a}={d}_{{k}^{\text{'}}k,a}\xb7\frac{{\mathit{z}}_{\mathit{a}\phantom{\rule{0.25em}{0ex}}}{\mathit{N}}_{{\mathit{k}}^{\text{'}},\mathit{a}}}{\mu}=\left(1-\mathit{r}\right){\mathit{d}}_{\mathit{k}{\mathit{k}}^{\text{'}}}\frac{{\mathit{z}}_{\mathit{a}\phantom{\rule{0.25em}{0ex}}}{\mathit{N}}_{{\mathit{k}}^{\text{'}}}}{\mu}

i.e. the final proportion *z*
_{
i
} of the {N}_{{k}^{\text{'}},i} hosts who contract the infection and diffuse with rate {d}_{{k}^{\text{'}}k,i} during their infectious period, with *i* = *c*, *a*. *z*
_{
c
} and *z*
_{
a
} indicate the attack rates in the children and adult age classes, respectively, and they are given by the solution to 1-{z}_{i}=\mathrm{\text{exp}}\phantom{\rule{1pt}{0ex}}\left(-{{\displaystyle \sum}}_{j}{R}_{\mathit{\text{ij}}}{z}_{j}\right)[52]. Figure 3A shows the behavior of the final attack rates *z*
_{
c
} and *z*
_{
a
} as a function of the reproductive number *R*
_{0}, considering the partition of the population in children and adults observed in the 8 European countries of the Polymod dataset [22] and their mixing properties. Variations between countries, depending on the age profile and the mixing patterns, are observed. Countries are generally predicted to have significantly larger epidemic sizes in children, as shown, for example, by the case of Italy. This would correspond, on average, to a larger number of individuals in the children class that could potentially seed other subpopulations and sustain the spatial invasion. However this effect is counterbalanced by the age-specific traveling probabilities that are much lower in the children class. In the case of Belgium the final epidemic sizes in the two classes are found to be almost equal, with the size of the epidemic in the adult population being slightly larger than the one in the children population. This is the only country in the dataset under consideration that has a ratio *η* larger than 1, indicating a larger average number of contacts established by adults with respect to children, likely induced by the specific survey methodology adopted [22].

If we indicate with *π*
_{
c
} (*π*
_{
a
}) the probability of extinction given that a single infected individual of class *C* (*α*) is introduced in the population, the probability {\mathrm{\Omega}}_{k\text{'}k}\left({\lambda}_{{k}^{\text{'}}k,c},{\lambda}_{{k}^{\text{'}}k,a}\right)\phantom{\rule{0.25em}{0ex}} that {\lambda}_{{k}^{\text{'}}k,c} and {\lambda}_{{k}^{\text{'}}k,a} infectious individuals traveling from subpopulation *k’* to subpopulation *k* would start an outbreak can be expressed as \phantom{\rule{0.25em}{0ex}}{\mathrm{\Omega}}_{k\text{'}k}\left({\lambda}_{{k}^{\text{'}}k,c},{\lambda}_{{k}^{\text{'}}k,a}\right)=1-{\pi}_{c}{}^{{\lambda}_{{k}^{\text{'}}k,c}}{\pi}_{a}{}^{{\lambda}_{{k}^{\text{'}}k,a}}. Here the two processes are considered as independent since we assume a multi-type branching process approximation. The probabilities of epidemic extinction given the introduction of a single case are determined as the solutions of the quadratic system dependent on the elements *R*
_{
ij
} of the next generation matrix [14], *π*
_{
i
} = [1 + *R*
_{
ii
}(1 − *π*
_{
i
}) + *R*
_{
ji
}(1 − *π*
_{
j
})]^{− 1}, with *i* = *c*, *a*. If *R*
_{0} < 1, the only solution is *π*
_{
c
} = *π*
_{
a
} = 1, i.e. the epidemic dies off. Otherwise, the system has solutions (*π*
_{
c
}, *π*
_{
a
}) in the range [0,1], as shown in Figure 3B. All countries (except Belgium) display a larger probability of extinction related to the introduction of a single adult case in the population, with respect to the introduction of a single infectious child. Given the mixing patterns, children are therefore more likely to start an outbreak than adults. *π*
_{
a
} ranges between 96.5% in the case of Italy and 99% in the case of Poland for *R*
_{0} = 1.05, and between 83.5% and 88% for *R*
_{0} = 1.2, showing that there were small chances for the H1N1 pandemic outbreak to start in the summer in those countries, in agreement with the observed unfolding [53]. In general, a large dependence of the probability of extinction on the reproductive number is observed. Analogously to the behavior discussed for the epidemic size, also for the probability of extinction Belgium represents a special case for which *π*
_{
c
} is slightly larger than *π*
_{
a
}, again induced by the larger average number of contacts in the adult class, differently from all other countries for which the data is available. Finally, in the case of homogeneous mixing in a non-partitioned population, we recover the probability of extinction to be equal to *R*
_{0}
^{−1} (dashed line in the figure).

The quantities reported in panels A,B of Figure 3 represent the ingredients to assess the risk of a global epidemic as driven by the partition of the population into age classes and the non-homogeneous mixing pattern considered, as also discussed in Ref. [14]. The additional role of the non-homogeneous travel behavior is considered explicitly by modeling the invasion process through Eq. (5). To solve the system analytically, we simplify the recursive relation of Eq. (5) in the assumption of mild epidemics (i.e. in the limit of *R*
_{0} close to 1), introduced or emerged in the system through a localized seeding event (so that the number of infected subpopulations can be neglected at the early stage of the spatial invasion), and considering the case of a mobility network lacking topological correlations (in this approximation the conditional probability *P*(*k*|*k*
^{'}) can be simplified, see the Additional file 1). By manipulating the Equation (see the Additional file 1 for the full details of the calculation), we obtain a condition allowing the increase of the number of infected subpopulations and a global epidemic in the metapopulation system only if

{R}_{*}=\left[\left(1-{\pi}_{c}\right)r{z}_{c}+\left(1-{\pi}_{a}\right)\left(1-r\right){z}_{a}\right]\frac{{w}_{0}}{\mu}\frac{\u3008{k}^{2+2\theta}\u3009-\u3008{k}^{1+2\theta}\u3009}{\u3008k\u3009}>1

(7)

thus defining the global invasion threshold of the metapopulation system. Equation (7) defines the threshold condition for the global invasion: if *R*
_{*} assumes values larger than 1, the epidemic starting from a given subpopulation will reach global proportion affecting a finite fraction of the subpopulations of the system; if instead *R*
_{*} < 1, the epidemic will be contained at its source and will not spread further to other locations. The global invasion threshold is a complex function of the disease history parameters, and of the parameters describing the age-specific mixing patterns and travel behavior through *π*
_{
c
}, *π*
_{
a
}, *r*, *z*
_{
c
}, *z*
_{
a
}. Its dependence on the population spatial structure is embodied by travel fluxes and the topology of the mobility network, through *w*
_{0}, *θ*, and the degree moments 〈*k*
^{a}〉. However it is important to note that this indicator does not depend on the number of subpopulations *V* of the system, therefore it may be applicable to a variety of countries for which data is available, independently of their size. In the following we explore the dependence of *R*
_{*} on these multiple factors, examine their role in driving the pandemic extinction or invasion, and provide possible applications examples for a set of countries considering the 2009 H1N1 pandemic.

### Impact of air mobility

The term *w*
_{0}(*k*
^{2 + 2 θ} − *k*
^{1 + 2 θ})/*k* represents the contribution of the air mobility network to the global invasion threshold, with *w*
_{0} and *θ* regulating the travel fluxes, and the various moments of *k*, {k}^{m}={{\displaystyle \sum}}_{k}{k}^{m}P\left(k\right), expressing the dependence on the network structure as encoded in its degree distribution *P*(*k*) = *k*
^{− γ}. The large degree fluctuations found in real transportation systems and mobility patterns [15–19] constitute one of the mechanisms responsible for driving *R*
_{*} to considerable high values, even when small values of the reproductive number are considered. Figure 3C shows the dependence of *R*
_{*} on the reproductive number accounting for changes in the topological heterogeneity of the mobility network (*γ* = 2 and *γ* = 3) and in the traffic heterogeneity along the air connections (*θ* = 0.5 and *θ* = 0, the latter being the homogeneous traffic case). The larger degree fluctuations obtained in the case *γ* = 2 strongly increase the ratio (*k*
^{2 + 2 θ} − *k*
^{1 + 2 θ})/*k*, leading to values of *R*
_{*} ranging from ~10 for *R*
_{0} = 1.05 up to approximately 10^{3} for *R*
_{0} = 2, a value of the reproductive number consistent with the estimate for the 1918–1919 pandemic [54]. It is important to note that, besides the threshold condition *R*
_{*} > 1, the absolute value of the estimator *R*
_{*} provides a quantitative indication of the effective reduction that needs to be reached through public health interventions in order to bring *R*
_{*} below its threshold value, i.e. the difference *R*
_{*} − 1. The *R*
_{*} values for *γ* = 2 are roughly one order of magnitude greater than the values recovered in the case *γ* = 3, and in addition the condition for the epidemic invasion is more sensitive to variations in *R*
_{0} in the case of larger heterogeneities in the air mobility patterns (*γ* = 2 vs. *γ* = 3).

For both network topologies considered, the epidemic is above the threshold value of 1, and the outbreak is predicted to spread globally in the system for all diseases considered (*R*
_{0} ≥ 1.05), consistently with the H1N1 influenza virus invasion at the global level. The partition into classes, though lowering the epidemic sizes and increasing the probability of extinction [35, 55, 56], is not able to drive the system below the threshold of the global invasion for the range of values explored. In addition, the contribution of the topological heterogeneity of the mobility network (*k*
^{2 + 2 θ} − *k*
^{1 + 2 θ})/*k* in increasing the value of *R*
_{*} is so large that it cannot be easily counterbalanced by reductions of the mobility scale *w*
_{0} corresponding to interventions through air travel restrictions. This was already observed in numerical results obtained from data-driven modeling approaches and in analytical predictions based on simple homogeneous mixing among individuals within the subpopulation of the system [28, 46, 50, 57–60]. We will see in the following subsections how differences across countries may impact the conditions for invasion, and will quantitatively assess within this framework the role of travel reductions consistent with the traffic drop observed in the traffic to/from Mexico.

Epidemic containment is instead reached for *R*
_{0} < 1.2 when exploring homogeneous traffic flows (i.e. *θ* = 0 differently from the realistic scenario *θ* = 0.5) in the case *γ* = 3 (see Figure 3C), showing how the large variations observed in the traffic flows along air travel connections represent an additional element favoring the spatial spread of the pathogen. On the other hand, extensive measures aimed at radically altering the amount of passenger traveling or the distribution of traffic on the air connections can hardly be achieved in reality.

### Impact of the contacts ratio *η*

The global invasion threshold is a function of the extinction probabilities and of the epidemic sizes per age class for which an explicit solution cannot be obtained in the general case. An approximate solution can be recovered for small *ε* and in the two limit cases of the contacts ratio *η* = *q*
_{
a
}/*q*
_{
c
}: *η* → 0, i.e. a regime in which almost all contacts are established by children, and *η* → 1, i.e. a situation that is almost homogeneous in the distribution of the number of contacts per individual. The approximate solutions are reported in the Additional file 1.

Besides these two limit cases, we investigate in Figure 4 the behavior of *R*
_{*} as a function of the contacts ratio *η*, exploring the interval [0.25,1] to include the estimates from the Polymod data (range from 0.62 to 0.97, Belgium excluded as discussed before) and to satisfy the existence condition on *ε*. The value of *α* is fixed to the European average, *α* = 0.197, whereas each country is represented with its corresponding assortativity parameter *ε*, ranging from *ε* = 0.083 in Italy to *ε* = 0.125 in Belgium.

The global invasion threshold is predicted to increase with *η*, showing how a larger number of contacts established by adults, regardless of the across-groups mixing, would favor the spatial propagation. The variations observed among countries are induced by the country-specific assortativity profiles and decrease with *η*. Therefore, if *R*
_{*} reaches its critical level for relatively small values of *η*, for which variations across countries are still large, we may reach a heterogeneous outbreak situation in which some countries would experience spatial transmission, while others would be able to contain the outbreak simply due to the role of country-specific age profiles and of the level of assortativity in each population. This may be for example the case with *γ* = 3 and *R*
_{0} < 1.2 (Figure 4B). Results are consistent with the numerical evidence obtained from a data-driven agent-based model in Europe [61].

Countries characterized by particularly low values of *η*, for cultural, behavioral, and/or social reasons, would be at a lower risk of invasion. Therefore control measures aimed at reducing the contacts ratio *η* of a specific country may represent an effective policy option to consider. This could be achieved through the application of workplace interventions, including for instance working at home, reducing or avoiding work meetings, and shifting the working timing to reduce overlap at workplaces and at break hours, as well as crowding on transports [62].

Finally, we report on the effects of the inclusion of the latency period in the compartmental model. The results reported in the Additional file 1 uncover the dependence of the global invasion threshold on the generation time of the disease, i.e. the sum of the latency and infectious periods in the compartmental approximation considered. A simple addition of the latency period to the description of the disease would therefore increase the generation time and, consequently, the global threshold parameter *R*
_{*}, while keeping the qualitative picture unchanged (Additional file 1). Individuals would indeed have a longer time span available to travel and potentially spread the disease while carrying an infection.

### Impact of assortativity and age profile

We now examine the dependence of the critical condition for the global invasion on the assortativity level of the population partition, by plotting in Figure 5
*R*
_{*} as a function of the parameters *ε* and *η*, where we have set the children fraction *α* equal to the European average value, *α* = 0.197. *R*
_{*} is an increasing function of the across-groups mixing *ε*, indicating that a decrease in the assortativity level of the mixing pattern (corresponding to an increase of *ε*) favors the spatial invasion. If a larger fraction of contacts is indeed established between adults and children, the local transmission dynamics mainly driven by children is able to spread to a larger fraction of the adult population, thus increasing the chances for the spatial dissemination of the pathogen. Here we study a situation in which *r* = 0, i.e. only adults travel, in order to isolate the effect of changes in the local transmission while neglecting the role of children in the mobility process.

The rectangle shown in the panels indicates the ranges of the country values for *ε* and *η* in Europe. A variation of *η* from the smallest to the largest of the country values produces a stronger effect on *R*
_{*} with respect to a variation of *ε* within the European range. As such, *η* represents an important source of country heterogeneity in the epidemic outcome, as already discussed.

The two panels differ for the value of the reproductive number considered that takes into account the effective reduction in the transmission potential observed during school holidays (panel A, corresponding to *R*
_{0} = 1.05) with respect to school term (panel B, *R*
_{0} = 1.4). These values are estimated for the UK on the basis of contact data for the country in the two periods [36], as illustrated in detail in the Methods section, and here we generalize this result to all countries under study to provide a comparison between school opening and school closure in terms of the predicted risk of a major outbreak. During school holiday period (panel A), European countries are found to be close to the critical level, with a portion of the parameter space for Europe in the extinction region, thus confirming the results presented before regarding variations in the country-specific risks of major epidemics. These findings are compatible with the heterogeneous transmission of H1N1 influenza virus in summer 2009 that was not sustained across continental Europe [53] and identify the main mechanisms responsible of these effects in the interplay between demographic/mixing features and the seeding during school holidays (due to similar school calendars, with the exception of the UK). If instead we consider that the influenza pandemic arrived in the UK during school term, our predictions indicate that the risk of a major epidemic with community and spatial transmission was expected to be very high even during summer period (Figure 5B), as observed in the country. Other factors undoubtedly played a role in producing the different transmission across countries, and they include humidity conditions [53], and the timing and magnitude of the coupling of the European continent to Mexico and the United States through international travel [3], both factors being country-specific and not considered here.

Additional countries with similar age profile may be mapped onto the two-dimensional plots of Figure 5 to gather immediate analytical insight regarding the risk of a major epidemic for the pathogen under consideration, given data availability on the country-specific demographic profiles and mixing habits. This would provide valuable predictions on the conditions for spatio-temporal transmission and informed recommendations for effective control strategies at the start of an outbreak. As an example, we also explored the situation for the United States, considering synthetic information on individual contact networks built from activity surveys and simulations for the city of Portland [25]. By assuming the validity of this synthetic information for the whole country, we can compare the global invasion threshold in the US with the one studied in Europe from real country-specific data. Similar properties are found in the air transportation networks of the two regions, and the slight differences in the age partition (*α* = 0.24 in the US vs. *α* = 0.20 in Europe) do not lead to great discrepancies in the behavior of the global threshold as a function of the parameter *η* (Additional file 1).

If we assume that children travel according to the statistics obtained from travel data (i.e. *r* is set to 7%), the increase in *r* leads to an increase of *R*
_{*}, as expected, given that a fraction of the individuals driving the local outbreaks also represent potential seeds in new locations not yet affected by the epidemic (see Figure 6A). Such a small change in *r* may also be enough to drive the system from the extinction phase to the major epidemic phase, for certain values of the other parameters. For example, an epidemic starting in a subpopulation in Italy, where the across-groups mixing is equal to *ε* = 0.08, would reach pandemic proportion if a small fraction of children would travel by air, with respect to the case in which such fraction is neglected (see the vertical line in Figure 6A). Given the specific seeding role of children vs. adults, age-targeted entry screening of travelers at airports may be envisaged in an attempt to prevent spatial transmission. The mild and self-limiting nature of most influenza infections, in addition to the presence of asymptomatic infections, is however likely to prevent a perfect identification and notification of cases at entry ports. Given that even a small percentage of children traveling would considerably increase the risk of spatial transmission, leading only to small delays in the invasion process [63], such interventions should be evaluated with respect to their actual efficacy in specific epidemic emergencies and balanced against the resources required for their implementation.

Besides mixing patterns and travel statistics, countries also differ for their age profiles (considered in the model through the parameter *α*), an easy to access statistics for all countries in the world [21]. The country seed of the 2009 H1N1 pandemic, Mexico, has for instance a larger fraction of the population in the younger age class compared to the population of Europe, *α* = 0.32 vs. *α* = 0.20 (where the age classification used for Mexico is up to 15 years old instead of up to 18 years old as adopted in Europe, due to data availability) and is characterized by a larger number of contacts established by children with respect to adults (*η* = 0.32 vs. *η* = 0.79). Our predictions indicate that, for the same assortativity values, there exists a range of values of the epidemic transmission potential that is compatible with epidemic containment in Mexico, whereas Europe may experience spatial propagation of the disease (see Figure 6B for *R*
_{0} = 1.05). Remarkably, the increase of the fraction of children and the change in the contact ratio, with no change in the across-groups mixing pattern (i.e. same *ε*), is able to drive the system in the Mexican scenario to the extinction phase, reducing of 94% the global invasion threshold obtained for Europe if we consider a pathogen with a transmission potential compatible with the seasonal estimates of the H1N1 pandemic in Europe during summer 2009 [3]. Such results would be particularly important when considering the country where a new epidemic may emerge, as predictions obtained from previous works considering hypothetical seeding countries would be hardly applicable if demographic features are different. National preparedness plans may be informed by country-specific recommendations through the present approach, and an extensive exploration of the model’s results in terms of classes of demographic and contact pattern features would help in providing more general insights on classes of seeding scenarios.

In the case of *R*
_{0} = 1.4, i.e. the lower bound of the estimate of the reproductive number for Mexico based on the early outbreak of the 2009 H1N1 pandemic [2], our model predict that the country would be above the critical value, thus in agreement with the spatial spread that was observed.

### Effect of pre-existing immunity and travel reductions

If we consider pre-existing immunity in the population, calculating the fraction of individuals in the adult age class that corresponds to the estimated values from serological data available after the 2009 H1N1 pandemic [40–42], we find that immunity reduces the condition for the global invasion threshold, as we could expect since a fraction of the population is now modeled to be fully protected against the virus. Figure 7 addresses the comparison between the two cases, immunity and no-immunity, by showing the two corresponding critical curves *R*
_{*} = 1 in the *α*, *ε* plane for Europe and Mexico (panels A and B, respectively). The effect of pre-existing immunity in reducing the probability of a global epidemic spread is shown by the increase in the value of *ε* corresponding to the invasion condition *R*
_{*} = 1; a larger mixing across age classes is therefore needed for the pathogen to spatially propagate in case a fraction of the older age class is immune, whereas a more assortative population is predicted to be able to contain the emerging epidemic. The effect is very small for the case of Mexico, while for Europe it is more visible given that the considered European population is, on average, older than the Mexican one, and thus a larger fraction of the adult population is assumed to be immune in Europe with respect to Mexico (9.6% in Europe vs. 4.4% in Mexico). The points (*α*, *ε*) parameterized with the average European data and with Mexican data (corresponding dots in the Figure 7) both lie in the spatial invasion phase when we consider *R*
_{0} ≥ 1.2 in Europe and *R*
_{0} ≥ 1.4 in Mexico.

As an additional factor, we also consider the effect resulting from the application of travel reductions. We simulate the travel controls applied by some countries in addition to the self-reaction of the population avoiding travel to the affected area that was observed during the early stage of the 2009 H1N1 pandemic [39], by setting *w*
_{0} = 0.5, i.e. a uniform reduction along all travel connections, independently of the age of travelers. Such reductions reduce the phase space of parameters leading to global invasion, as expected, however they would not be able to lead to a containment of the disease once the model is fitted to the Mexican and European data and to the H1N1 pandemic scenario, confirming previous findings [28, 39, 46, 57–60].

### Limitations of the study

Our study is based on a multi-host stochastic metapopulation model that considers several simplifying assumptions that we discuss in this subsection.

The partition of the population into children and adult classes is clearly very schematic, especially if we consider that finer level classifications are available for demographic data at the global scale, and for contacts data, though in a very small set of countries that are the ones considered in this study. Similar considerations arise in the case of additional heterogeneities to be included in the model, as for instance differing patterns of transmission between households, schools, and workplace settings. A higher level of structuring of the population into classes is expected to decrease the epidemic sizes per comparable groups of classes, and to increase the probability of extinction of the epidemic, with respect to our predictions (see for instance the discussions in [35, 55, 56] and references therein). The inclusion of such features would prevent an analytical treatment of the model and therefore push the study towards an agent-based approach [61, 64, 65], for which numerical simulations would represent the only available methodology.

Another assumption concerns the exponentially distributed infectious period. More realistic descriptions of the infectious period – including constant, gamma-distributed or data-driven infectious periods – were found to alter the model results by reducing the probability of extinction. Such findings were however obtained in a single population model with homogeneous mixing [66] and in a two-population model coupled by mobility, where, on the other hand, individuals were allowed only one trip, i.e. to change only one time their subpopulation [67]. Such modeling approaches and corresponding results are not applicable to our case, and a systematic understanding of the impact of the infectious period distribution on the probability of extinction in a metapopulation model is still missing.

Our analytical approach also assumes that the importation or the emergence of an infectious disease is highly localized at the beginning of the outbreak, so that it is possible to approximate the spatial spreading process in terms of a branching process evolving in a set of subpopulations not yet affected by the disease. This is similar to the approximations used to calculate the basic reproductive number in a fully susceptible population, and it is required to treat the model analytically. While this assumption can generally be considered as a good approximation to describe the early phase of an outbreak, more complicated seeding events may occur that would require numerical approaches able to explicitly take into account the initial conditions and assess the epidemic risks.

Our model is fit to demographic statistics of a set of countries and it is informed with H1N1 epidemic estimates to provide quantitative information on the risk for the pandemic invasion in such countries. However, in all our predictions we assume the same population structure (*α*), contacts and mixing profiles (*ε*, *η*), and travel behavior (*r*) across all subpopulations of the metapopulation system, as informed by the data for a given country. We consider this approximation a reasonable one for regions characterized by population features that are quite uniform across space, for instance if we consider the subpopulations within a given country; already at the European level we noticed how small variations in demographic features and mixing patterns may be responsible for diverse outcomes regarding extinction or invasion, and additional layers of heterogeneities need to be considered when variations among populations within the system are larger (e.g. regarding travel behavior per age class, as shown in Figure 1B). This could be achieved by considering subpopulation-dependent variables {*α*
_{
i
}, *ε*
_{
i
}, *η*
_{
i
}, *r*
_{
i
}} (with *i* indicating the subpopulation), however preventing an analytical treatment to solve the system due to the additional complications considered, thus requiring the use of numerical approaches.

Recent work relying on large-scale transmission models has explored the ability of these approaches to predict the timing of spread of the 2009 A/H1N1 influenza pandemic around the world [68]. Differently from these numerical approaches that can describe the geotemporal propagation of the infectious disease in the population, our model does not provide any temporal information on the epidemic unfolding in that all dynamical aspects are synthetically summarized in a branching process leading to the condition for global invasion. On the other hand, given the relevance of demographic profiles, mixing patterns and age-specific travel resulting from the present study, it would be important to further extend large-scale spatial transmission computational models to include such features. While computationally feasible, the main limitation nowadays is represented by the availability of mixing data and travel behavior for a large set of countries, given a global level objective. This further supports the need to have multiple modeling frameworks that can complement each other in providing important information to characterize an emerging epidemic and its associated risks and impact.

Finally, we note that changes in time of population behavior as a response to the ongoing outbreak cannot be dynamically incorporated in the model. These may refer for example to changes in the contact patterns due to self-awareness or changes at the community level due to the implementation of intervention strategies to control the epidemic [49, 69–73]. On the other hand, these scenarios can be separately studied with the model, assuming each of these features to be constant in time, in order to assess the effect of such changes on the corresponding risk of a major epidemic. As possible application examples we provided model predictions for different mixing patterns related to school terms and school holidays during H1N1 pandemic, as well as uniform travel restrictions that may result from self-impositions, national guidelines or travel bans. Future studies will focus on other historical epidemics, like e.g. the case of the 2002–2003 SARS outbreak, in order to explore which mechanisms, among the ones included in this approach and related to the applied interventions or to individual’s self-adaptation, hindered a fully global transmission of the disease.