Our findings illustrate the time-varying nature of sociodemographic and economic predictors of COVID-19 case incidence at the community level in Massachusetts during an eight-month period. These observations suggest that fixed assumptions regarding community-level COVID-19 vulnerability, such as increased risk among Black communities or continued elevated relative risk among the elderly, may not accurately represent the pandemic at every point in time, and that these associations should be continually reassessed for relevance alongside shifting mitigation efforts, policies, and individual behavior. While other studies have identified racial/ethnic disparities in COVID-19 incidence in Massachusetts and the extent to which these patterns are explained by social factors [7], our study provides novel insight into how both patterns and predictors change over time. Conversely, associations that remain significant over time with consistent coefficients in these adjusted models, such as the association for workers in essential services, may suggest that existing approaches to reduce risk within these subgroups are less effective, perhaps due to structural challenges in reducing certain exposures.
We note that the community-level predictor variables used here serve as proxies for underlying factors that influence individual viral exposure risk. This distinction is important in our analysis, as it remains possible that analyses using individual-level predictors would lead to different findings. However, our analysis provides insight into sociodemographic patterns of COVID-19 and subpopulations who may be at elevated risk, which can inform community-scale public health interventions and location-specific vaccination strategies. It also offers a useful and adaptable structure for assessing risk using public data.
The consistently elevated risk observed in communities with increased Latinx populations (in models adjusted for essential workers and other sociodemographic variables) may underscore challenges in reaching these communities with successful interventions, or barriers to reduced exposure in these communities. The sustained elevated risk of Latinx populations throughout the first 8 months of the pandemic is consistent with recent studies [7, 33]. In our models, Latinx population was positively correlated with greater housing density, suggesting that this ethnicity variable may serve in part as a proxy for crowded housing (> 1 person/room). Within-household exposure is an established risk factor for COVID-19 transmission [34, 35], and it is possible that our findings here reflect established challenges in reducing transmission within housing environments where residents are unable to meaningfully distance from an infectious individual.
We did not observe a correlation between town-level housing density and Black populations, or between this race variable and other sociodemographic covariates in our model. This finding suggests that other factors beyond the scope of our analysis may be responsible for the elevated COVID-19 risks faced by towns with a higher percentage of Black residents, especially early in the pandemic. Our findings parallel core conclusions of Figueroa et al., who evaluated cross-sectional COVID-19 case incidence alongside demographic data in Massachusetts from March–May 2020 [7]. The authors noted racial disparity in disease incidence in the early wave after adjustment for essential workers, immigration, and household size, while the association between COVID-19 cases and Latinx population was attenuated in models adjusted for these factors. Together, our studies support the hypotheses that systemic racism and inequities not otherwise captured in core demographic datasets play a role in driving COVID-19 racial inequities and that distinct analyses focused on systemic inequities are needed to fully understand the specific risks faced by Black individuals and communities.
Reduced risk of COVID-19 in communities with higher percentages of Black residents, adjusting for other factors, may suggest that the clear racial disparity observed in early months of the pandemic has diminished over time in Massachusetts [5, 10, 36, 37]. Our finding parallels other observations of reduced racial disparity in COVID-19 cases over time [38]. Likewise, substantial reductions in the associations of long-term care beds and percentage of town population over 80 years with COVID-19 incidence may reflect success in interventions to protect the elderly, especially those living in long-term or nursing care facilities, after the initial, devastating impacts on this population early in the pandemic [39, 40]. It remains possible that factors not included in our analysis, notably biases associated with testing availability or other unstudied correlates, are responsible for the reductions in risk we observe here, especially by race. Analysis of more severe outcomes, including hospitalizations and deaths, would help inform these questions, although such analyses would require coarser temporal resolution, and should be prioritized for future work.
The persistent positive association between essential workers and COVID-19 may reflect the continued vulnerability of this workforce to viral infection, despite workplace controls and personal mitigation behaviors, such as masking and maintaining social distance. Essential workers remained at the highest risk of all subgroups studied in our models throughout the pandemic, highlighting key challenges in protecting these populations, even many months into the pandemic. Interestingly, however, a covariate representing people spending more than 3 h in a location other than their home during office hours showed a null association with COVID-19 cases. The non-significant effect of worker mobility on case incidence may reflect limitations in the underlying data. SafeGraph data include only 10% of cellphones in the US [41], although the data are highly correlated with true Census population [42]. However, it could also indicate that changes in worker mobility (e.g. “return to work” efforts) were not associated with increased cases when these workers were not part of the essential workforce. This would reinforce that communities with more non-essential workers face distinctly lower risk than those with more essential workers, even with resumption of economic activities, highlighting inequities in exposure profiles on the job.
The significant association between the percentage of town residents without health insurance and case incidence in the spring and summer periods may suggest growing case incidence among immigrants during those times. Due to a state health insurance mandate, the number of uninsured individuals in Massachusetts is low (2.8% of the total state population), but some municipalities, such as Chelsea, Everett, and Lawrence, have uninsured rates up to 25%. These communities have higher proportions of younger adults, non-US citizens, and those who are less educated than the insured population [43]. Targeted interventions that focus on communities with elevated percentages of uninsured persons are important in fully understanding disease dynamics in the state.
As noted, our study is limited by the use of town-level covariates, which reduces our ability to draw causal inferences on the individual level; nonetheless, our insights remain relevant for targeted public health strategies. While the use of static predictors from the ACS and other databases allowed us to evaluate and interpret changes in coefficient magnitude and significance over time, it gave us limited ability to capture time-varying exposure factors (with the exception of SafeGraph data). While most of these predictors are expected to be fairly stable during the study period, notably demographic data, it is possible that real-time changes in these covariates may have altered our findings.
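The idea of holding predictors fixed and comparing coefficients across time periods can be illustrated with a minimal sketch. It is not the model used in this study: it assumes a simple ordinary least squares fit of log incidence on one static covariate, and the data, the period labels, the covariate name `pct_essential`, and the helper `period_coefficients` are all hypothetical and synthetic.

```python
import numpy as np

def period_coefficients(x, log_rates_by_period):
    """Fit log_rate = a + b * x separately per period; return {period: b}.

    Hypothetical helper for illustration only (plain least squares,
    not the adjusted models described in the paper).
    """
    design = np.column_stack([np.ones_like(x), x])  # intercept + covariate
    slopes = {}
    for period, y in log_rates_by_period.items():
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        slopes[period] = float(coef[1])
    return slopes

# Synthetic town-level data: one static covariate, three time periods.
rng = np.random.default_rng(0)
n_towns = 200
pct_essential = rng.uniform(10, 40, size=n_towns)  # assumed static predictor

# Assumed "true" slopes that differ by period (invented for this sketch).
true_slopes = {"spring": 0.05, "summer": 0.03, "fall": 0.04}
log_rates = {
    period: 1.0 + b * pct_essential + rng.normal(0, 0.1, size=n_towns)
    for period, b in true_slopes.items()
}

slopes = period_coefficients(pct_essential, log_rates)
for period, b in slopes.items():
    print(f"{period}: estimated slope = {b:.3f}")
```

Because the covariate is identical in every fit, any difference between the period-specific slopes reflects a change in the association rather than a change in the predictor, which is the interpretive advantage of static predictors described above.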
It is possible that disparities in testing availability, as noted earlier, contributed to observed trends in case incidence if communities with greater racial diversity had reduced access to case identification, and this is an important area for additional work. It is also possible that time lags in case reporting may have resulted in misclassification in case data by time period, or that incorrect or incomplete address information misidentified town of residence of cases. While we suspect that misclassification on the basis of wrongly attributed case information was largely due to human error and therefore nondifferential, it remains possible that bias affected our estimates in ways that are difficult to predict. In addition, our empirical findings may not generalize to later time periods when vaccines became widely available, although our statistical approach could be directly applied to these time periods and could yield insight regarding how vaccination patterns influenced the sociodemographic predictors of COVID-19 cases.
Our analyses demonstrate that the relevance and magnitude of community-level risk factors for COVID-19 are not fixed, which may reflect the relative effectiveness of intervention and mitigation efforts across population subgroups (or, conversely, time-varying factors that contribute to elevated risk). Our study highlights the need for local jurisdictions to use up-to-date data on vulnerable and high-risk populations to direct COVID-19 interventions, including vaccinations, rather than data from early in the pandemic. Our models were derived entirely from publicly available data and could be rapidly refit to newer data. Community-level analyses can help characterize social inequities embedded in the pandemic and track the evolution of these inequities with time, highlighting successes as well as disproportionate burdens experienced by vulnerable populations.