Different epidemiological models [12] capture different aspects of disease spread and many of these models are based on coupled ordinary differential equations (ODEs). In the susceptible-infected-recovered (SIR) model [6], the rate of change of S(t), the number of susceptible individuals at time t, is described by the ODE
$$\begin{aligned} {\dot{S}}(t)= -\beta S(t)I(t)/N. \end{aligned}$$
(1)
Here, I(t) and N denote the number of infectious individuals at time t and the population size, respectively. The infection rate is \(\beta\).
We now assume that the epidemic state of a population can be represented by some quantity y(t) and that its evolution (i.e., the rate of change) is described by a function g(y(t), t). That is,
$$\begin{aligned} {\dot{y}}(t)= g(y(t),t). \end{aligned}$$
(2)
The SIR model (1) can be written in terms of Eq. (2) by setting \(y(t)=(S(t), I(t), R(t))^\top\).
Euler’s method [9] is one of the simplest numerical procedures for solving ordinary differential equations of the form (2) for a given initial condition. This method uses a timestep \(\Delta t>0\) to approximate the solution of Eq. (2) at times \(t_1, t_2, \dots , t_n\) according to9, 10
$$\begin{aligned} y(t_{n+1})= y(t_{n}) + \Delta t {\dot{y}}(t_n). \end{aligned}$$
(3)
However, in reality the functional form \(g(\cdot )\) that describes the rate of change of y(t) that is relevant for infectious disease surveillance is usually not known. In the following paragraphs, we thus describe practical ways how to estimate COVID-19 fatalities y(t) and their local rate of change \({\dot{y}}(t)\) from noisy observation data.
We first collected data on CDC ensemble forecasts between June 2020 and June 2021 [1].Footnote 1 Ensemble forecasts are available for cumulative and weekly incidence and fatality numbers and a forecasting horizon between 1 to 4 weeks. All forecasts use data from the Johns Hopkins Coronavirus Resource Center [11] as ground truth. Forecasts are made for epidemiological weeks which run Sunday through Saturday. As an example, if forecasts with 1- and 4-week forecasting horizons are being made on June 8, 2020 the corresponding forecasting intervals are June 7–June 13, 2020 and June 7–July 4, 2020 [13].
We compare CDC and ECDC ensemble forecasts of COVID-19 fatalities with a simple and easily interpretable forecasting method. To do so, let y(t) be the incidence of COVID-19 fatalities at time t. We use \({\dot{y}}(t)\) to denote the rate of change of y(t) at time t. Forecasting the incidence \(y(t+\Delta t)\) at a target time \(t+\Delta t\) requires us to find an estimate of this quantity at an earlier time t. A straightforward way to construct short-term forecasts is to (i) use the current rate of change \({\dot{y}}(t)\) and (ii) determine a forecast at time \(t_k=t_0+k\Delta t\) according to the Euler method [9, 10]
$$\begin{aligned} y(t_0+k\Delta t)= \underbrace{y(t_0)}_{\text {last incidence}} + \underbrace{k\Delta t\,{\dot{y}}(t_0)}_{\text {incidence correction}}, \end{aligned}$$
(4)
where \(\Delta t\) and \(k=1,2,\dots\) represent a time step (e.g., 1 week) and the number of time steps in the forecasting horizon, respectively. However, observed incidences are subject to observation noise that results from confounding factors including sampling bias, measurement errors, and reporting delays [14].
A possible way to “de-noise” observed data is to use weekly incidences instead of daily incidence levels. If observational noise can be reduced by averaging over a period of several days, daily errors are less pronounced on a weekly level. However, the local daily derivative is quite sensitive to noise and our incidence correction term may not help in making accurate short-term forecasts. Therefore, we can impose some degree of regularity to reduce the level of noise with the following minimization
$$\begin{aligned} {{\,\mathrm{arg\,min}\,}}_{\{w_k\}} \sum _k (y_k-w_k)^2 + \uplambda \sum _k (w_k-w_{k-1})^2, \end{aligned}$$
(5)
where \(y_k=y(t_0+k\Delta t)\), \(w_k=w(t_0+k \Delta t)\) is a regularized approximation of \(y_k\), and \(\uplambda\) is a regularization parameter. In the limit \(\uplambda \rightarrow 0\), the argument of Eq. (5) is minimized if w(t) approaches y(t). In the limit \(\uplambda \rightarrow \infty\), the argument of Eq. (5) is minimized if w(t) is constant (i.e., if \(w_k-w_{k-1}=0\)). This optimization process has its equivalent Euler–Lagrange formulation for numerical differentiation [15, 16]. Values of \(\uplambda \in (0,\infty )\) yield functions w(t) that are smoothed versions of y(t) with respect to the discrete rate of change \(w_k-w_{k-1}\). Finally, the regularized Euler short-term forecastFootnote 2 is given by
$$\begin{aligned} y(t+k\Delta t)= y(t) + k\,[w(t)-w(t-\Delta t)]. \end{aligned}$$
(6)
In the following section, we use both the standard Euler method and the regularized Euler method to generate forecasts of reported COVID-19 fatalities.
Our source codes are publicly available at [17].