Setting
Suffolk County is a large county (2362 square kilometers [km²]) of approximately 1.5 million people that predominantly acts as an outer suburban community serving New York City. The median age is 41.8 years; 66.6% of residents are non-Hispanic White, 20.2% are Hispanic, and 8.8% are Black, while the remainder predominantly report being Asian or being of two or more races. The median household income in Suffolk County is 54.6% higher than the national average. Overall, 6.8% of households fall below the national poverty line and 5.2% report lacking health insurance. Suffolk County is relatively densely populated, with 645.6 people/km².
Measures
To examine the potential for exterior exposure risk, we modeled COVID-19 incidence using cases reported to the Suffolk County Department of Health from March 16th, 2020, when data first began to be recorded reliably through an electronic interface, until December 31st, 2020. At the end of that period, Suffolk County was enduring a second wave. Daily case counts were shared with Stony Brook University to support COVID-19 modeling efforts at the local level. After cleaning, county-level data were published online in a publicly accessible database (Additional file 1 provides cleaned county-level data merged with the other variables used in this study). The March 16th, 2020 start date also coincided with the opening of multiple drive-through testing sites throughout the area and the establishment of regular case-reporting routines. Susceptible population estimates were based on overall county residential estimates derived from the U.S. census and were updated for daily death counts and for the reported number of COVID-19 cases.
Since daily case counts exhibit temporal dependence that is primarily determined by the unobserved community force of infection, in secondary analyses we examined an alternative outcome measure of relative change in daily case counts compared to an 8-day forward/backward autoregressive moving average [17], as defined by:
$$\frac{\mathrm{cases}(t)- \frac{1}{8}\sum_{k=t-4,\,k\ne t}^{t+4}\mathrm{cases}(k)}{\frac{1}{8}\sum_{k=t-4,\,k\ne t}^{t+4}\mathrm{cases}(k)}$$
The 8-day forward/backward moving average, when integrated into the model, serves as a proxy measure of the underlying force of infection. This allows us to partially capture the variability in absolute case counts that is due to “natural” transmission patterns rather than external shocks such as wind speed. On average, this measure would be zero when case counts remain relatively constant over time. However, it will track the periods of exponential rise (where it will be positive) and decay (where it will be negative) of an epidemic’s waves. It is therefore important to take these distinct behaviors into account.
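As an illustration of the formula above, the following Python sketch computes the relative change for a single day from a list of daily counts. The function name `relative_change` and the handling of incomplete windows (returning `None` at the edges of the series or when the neighborhood mean is zero) are assumptions made for this example, not choices documented in the study.

```python
def relative_change(cases, t, half_window=4):
    """Relative deviation of cases[t] from the mean of its 8 neighbors.

    The neighbors are the 4 days before and the 4 days after day t;
    day t itself is excluded, matching the k != t condition.
    """
    lo, hi = t - half_window, t + half_window
    if lo < 0 or hi >= len(cases):
        return None  # assumed: undefined when the window is incomplete
    neighbors = [cases[k] for k in range(lo, hi + 1) if k != t]
    mean = sum(neighbors) / len(neighbors)
    if mean == 0:
        return None  # assumed: undefined when no cases surround day t
    return (cases[t] - mean) / mean
```

A flat series yields 0, while a day with twice the count of its neighbors yields 1.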
Maximal daily temperature and average wind speed were derived from the U.S. National Oceanic and Atmospheric Administration data portal (w2.weather.gov). Data were recorded at a central location at MacArthur Airport in Islip, N.Y. Total snowfall and rainfall were recorded in inches and converted to centimeters. While temperatures of 16–28 °C are likely to be protective, an impact of reduced wind speed on these days may emerge because individuals are more likely to be socializing outdoors, where risk is markedly lower. In the summer, higher wind speed increases airflow and may reduce risk, whereas in the winter it may push outdoor social contacts into indoor spaces. When exterior temperatures are warm enough (16–28 °C) for outdoor social contacts to occur comfortably, we anticipated that increased wind speed would reduce overall transmission risk. In contrast, on days when exterior temperatures were cooler, increased wind speed might cause individuals to retreat indoors for social occasions.
Covariates
We adjusted for the number of days since lockdown (March 16th, 2020) and days since reopening began (May 15th, 2020) in Suffolk County, N.Y. To account for differences in daily reporting patterns, we incorporated a categorical variable indicating the day of the week on which cases were reported. Noting that there has been significant spread following holidays, we incorporated a holiday indicator that also covered the most significant nearby weekend. We also included covariates measuring rainfall and snowfall because they may correlate with wind speed as well as with outdoor social activities. In the primary analysis, we also adjusted for the 8-day forward/backward moving average daily case count.
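The calendar-based covariates can be sketched in Python as below. The lockdown and reopening dates come from the text; the function name `calendar_covariates`, the `holiday_dates` argument, and the choice to set the day counters to zero before each event are assumptions for illustration only.

```python
from datetime import date

LOCKDOWN = date(2020, 3, 16)   # lockdown date given in the text
REOPENING = date(2020, 5, 15)  # reopening date given in the text


def calendar_covariates(day, holiday_dates=frozenset()):
    """Build the calendar covariates for one reporting day.

    holiday_dates is a hypothetical set of dates flagged as holidays
    (including the associated weekend); the source does not list them.
    """
    return {
        # Assumed: counters stay at zero before the event occurs.
        "days_since_lockdown": max((day - LOCKDOWN).days, 0),
        "days_since_reopening": max((day - REOPENING).days, 0),
        # 0 = Monday ... 6 = Sunday, to be treated as categorical.
        "weekday": day.weekday(),
        "holiday": day in holiday_dates,
    }
```

The weekday value would enter a regression as a set of indicator variables rather than as a numeric trend.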
Statistical modeling
Descriptive characteristics include time-related trends in maximal temperature, average daily wind speed, and daily case counts. Daily and smoothed trends in maximal temperature and in average wind speed were reported.
In the main analysis, COVID-19 incidence was recorded as case counts per day, so multivariable-adjusted modeling relied on negative binomial regression [18]. Negative binomial regression was chosen over alternatives including Poisson regression because we were concerned about the potential for over-dispersion in the outcome [19], since infectious disease caseload is highly variable and because COVID-19 appears to spread commonly through super-spreading clusters [20]. A nine-day lag between exposure and case registration was assumed, consistent with epidemiological estimates of the incubation period for COVID-19 [21, 22] coupled with a two-day testing and one-day reporting lag period that has been common in Suffolk County since testing became widely available. Unadjusted and multivariable-adjusted incidence rate ratios (IRR) and 95% confidence intervals (95% C.I.) were reported. The interval between infection and disease ascertainment is unobserved and varies geographically by local testing availability and reporting systems: it can be shorter in places where testing is easy to find and longer in places where testing is difficult to obtain or requires hospitalization. As such, we conducted a sensitivity analysis considering a range of values for the time interval between exposure and case reporting. For our lagging period, we allowed four days because our experience suggests that it takes two days to report testing results to the Department of Health, and an additional day to report those results publicly. Fifteen days was selected as a ceiling for index case analysis to reduce the risk of sequential outcomes from prior case/exposure cycles, consistent with prior publications [23]. However, in sensitivity analyses we report results for a 4–13-day range to clarify the impact of those choices. We used the log-likelihood to compare model fit for different lags.
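The lag structure described above amounts to pairing the exposure on day t with the cases reported on day t + lag. The Python sketch below shows only that alignment step, under the assumption that both series are daily and start on the same date; the function names are hypothetical, and fitting the regression itself is not shown.

```python
def align_with_lag(exposure, cases, lag):
    """Pair the exposure on day t with cases reported on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = min(len(exposure), len(cases) - lag)
    return [(exposure[t], cases[t + lag]) for t in range(max(n, 0))]


def lagged_datasets(exposure, cases, lags=range(4, 14)):
    """One aligned dataset per candidate lag (default 4 to 13 days)."""
    return {lag: align_with_lag(exposure, cases, lag) for lag in lags}
```

In a sensitivity analysis of this kind, the same model would be refitted on each aligned dataset and the fits compared, for example by log-likelihood. Longer lags leave fewer usable days, which matters when comparing likelihoods across lags.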
We analyzed the secondary outcome – a relative measure of daily case counts calculated as ln(incident cases/population * 100,000) – using linear regression with the same set of covariates as the primary outcome measure and exploring the results for a range of reporting lags.
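The log-transformed rate in the preceding paragraph can be computed as in this short Python sketch. The function name is hypothetical, and returning `None` for days with zero cases is an assumption: the text does not say how such days were handled, and the logarithm is undefined there.

```python
import math


def log_incidence_per_100k(incident_cases, population):
    """ln(incident cases / population * 100,000) for one day."""
    if incident_cases <= 0:
        return None  # assumed handling; ln(0) is undefined
    return math.log(incident_cases / population * 100_000)
```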
Since we theorized that the association between wind speed and COVID-19 transmission may differ depending on temperature, cutoffs for “warm” days and for days when wind speed was sufficiently fast were determined by comparing Akaike’s information criterion (AIC) across multiple models using different cutoff values as modeled parameters. We compared AIC between models to determine that 16 °C (60 °F) was an optimal lower bound in temperature, while follow-up analyses revealed an upper bound of 28 °C (84 °F). To account for seasonality, we also adjusted for the maximal daily temperature. Because cutoffs may be useful when adjudicating risk at the local level, we used AIC to identify optimal cutoffs for wind speed. This resulted in identifying low wind speed as < 8.85 kilometers per hour (KPH; equivalent to approximately 5.5 miles per hour).
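The cutoff search can be illustrated with the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood; the candidate with the lowest AIC is preferred. The Python sketch below assumes each candidate cutoff has already been fitted elsewhere and that only its log-likelihood and parameter count are passed in. All function names are hypothetical, and whether the temperature bounds are inclusive is an assumption.

```python
MPH_TO_KPH = 1.609344  # exact conversion factor for international miles


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_cutoff(fits):
    """Pick the cutoff with the lowest AIC.

    fits maps each candidate cutoff to (log_likelihood, n_params).
    """
    return min(fits, key=lambda cutoff: aic(*fits[cutoff]))


def is_warm(max_temp_c, lower=16.0, upper=28.0):
    """Warm-day indicator; inclusive bounds are an assumption."""
    return lower <= max_temp_c <= upper


def is_low_wind(wind_kph, cutoff=8.85):
    """Low-wind indicator using the reported < 8.85 KPH threshold."""
    return wind_kph < cutoff
```

As a check on the reported units, `5.5 * MPH_TO_KPH` is about 8.85, matching the wind speed threshold in the text.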
Since the relative measure of daily case counts only partially adjusts for the community force of infection and underlying “natural” epidemic dynamics, we also conducted additional sensitivity analyses stratified into periods when case counts were relatively flat (06/07/2020–11/03/2020) and when the epidemic was exponentially increasing (03/16/2020–04/10/2020 and 11/04/2020–12/31/2020) or decaying (04/11/2020–06/06/2020). We used two criteria, daily temperature and epidemic dynamics pattern (flat versus rising/falling), to determine subsets for stratified analyses. Analyses were completed using Stata 16/MP [StataCorp].
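The stratification can be sketched in Python as below. The period boundaries are taken from the text, read as month/day/year; the labels, the function names, and the inclusive 16–28 °C warm-day test are assumptions for illustration.

```python
from datetime import date

# Period boundaries from the text, read as MM/DD/YYYY.
PERIODS = [
    (date(2020, 3, 16), date(2020, 4, 10), "rising"),
    (date(2020, 4, 11), date(2020, 6, 6), "decaying"),
    (date(2020, 6, 7), date(2020, 11, 3), "flat"),
    (date(2020, 11, 4), date(2020, 12, 31), "rising"),
]


def epidemic_phase(day):
    """Return the phase label for a date, or None outside the study period."""
    for start, end, label in PERIODS:
        if start <= day <= end:
            return label
    return None


def stratum(day, max_temp_c):
    """Combine the two criteria: warm day (assumed 16-28 C inclusive)
    and epidemic dynamics (flat versus rising/falling)."""
    phase = epidemic_phase(day)
    if phase is None:
        return None
    dynamics = "flat" if phase == "flat" else "rising/falling"
    return (16.0 <= max_temp_c <= 28.0, dynamics)
```

Each day then falls into one of four subsets, and the model would be fitted separately within each.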