Social distancing causally impacts the spread of SARS-CoV-2: a U.S. nationwide event study

Abstract

We assess the causal impact of social distancing on the spread of SARS-CoV-2 in the U.S. using the quasi-natural experimental setting created by the spontaneous relaxation of social distancing behavior brought on by the protests that erupted across the nation following George Floyd’s tragic death on May 25, 2020. Using a difference-in-difference specification and a balanced sample covering the [− 30, 30] day event window centered on the onset of protests, we document an increase of 1.34 cases per day, per 100,000 population, in the SARS-CoV-2 incidence rate in protest counties, relative to their propensity score matching non-protest counterparts. This represents a 26.8% increase in the incidence rate relative to the week preceding the protests. We find that the treatment effect only manifests itself after the onset of the protests and our placebo tests rule out the possibility that our findings are attributable to chance. Our research informs policy makers and provides insights regarding the usefulness of social distancing as an intervention to minimize the spread of SARS-CoV-2.

Introduction

The highly contagious novel coronavirus, severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), responsible for coronavirus disease 2019 (COVID-19), emerged in December 2019 in Wuhan city, Hubei province, China [1]. The initial COVID-19 outbreak quickly evolved into a pandemic [2], and as of June 2020, SARS-CoV-2 has reached over 180 countries and regions, with the total number of confirmed cases surpassing 10 million globally [3]. COVID-19 has spread throughout the United States (U.S.) at an unparalleled rate, infecting over 2.5 million individuals and claiming over 125,000 lives [4]. Global public health measures aimed at reducing the spread of SARS-CoV-2 have been designed in consideration of the virus’s specific transmission properties [5]. SARS-CoV-2 can be transmitted through various modes, including person-to-person contact and the spread of respiratory droplets, which can travel across a minimum distance of 6 feet (2 m) [6, 7]. Numerous countries have introduced social distancing, defined as the maintenance of at least a 6 foot interpersonal physical separation, to minimize direct transmission from infected individuals [8].

In the U.S., individual states have been granted the authority to design their own COVID-19 mitigation strategy; therefore, the extent and type of social distancing policies adopted differ across states [9]. Research examining state-imposed restrictions has found a reduction in the doubling rate of SARS-CoV-2 among U.S. states [10], as well as in the daily growth rate of COVID-19 cases across U.S. counties after the imposition of social distancing measures [11, 12]. Other research has suggested that rather than reducing the daily growth rate of COVID-19, social distancing merely stabilizes the spread of SARS-CoV-2 in the U.S. [10]. Additionally, when examining the effectiveness of social distancing, studies have used social mobility as a measure of social distancing [11, 13, 14, 15]. However, mobility represents an imperfect proxy for social distancing because individuals can be mobile while still maintaining the required minimum 6 foot separation from others to prevent viral transmission. Furthermore, although evidence suggests there is an association between social distancing and the spread of SARS-CoV-2, the causal impact of social distancing on the spread of SARS-CoV-2 is still unknown.

In this study, we examine the causal impact of a spontaneous relaxation of social distancing measures on the spread of SARS-CoV-2. The nationwide mass protests precipitated by George Floyd’s tragic death on May 25, 2020 prompted an abrupt relaxation of social distancing behavior across the U.S. [16]. The unpredictable nature of the protests created a natural experimental setting to assess causality. In this study, instead of using mobility as a proxy for social distancing, we control for the increase in mobility during the protest period in order to home in on the direct effect of social distancing. We also explicitly control for the concurrent relaxation of state-imposed restrictions to account for variations in social distancing restrictions across states.

Methods

This study uses publicly accessible data exclusively and all statistical methods employed herein comply with relevant guidelines and regulations.

Data and sample description

We source our U.S. COVID-19 data from the Johns Hopkins Whiting School of Engineering’s Center for Systems Science and Engineering’s GitHub repository [17]. This data consists of confirmed cases in each county at the end of every day since the start of the outbreak in late January 2020. We calculate the number of new cases for each county and each day by subtracting the cumulative number of confirmed cases at the end of the previous day from the cumulative number of confirmed cases at the end of the current day.
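The differencing step, together with the per-100,000 scaling used later for the incidence rate, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy series, and the choice to treat the count before the first observation as zero are all assumptions.

```python
def daily_new_cases(cumulative):
    """Return day-over-day new cases from a list of cumulative counts.

    Assumption: the cumulative count before the first observation is zero.
    """
    new_cases = []
    previous = 0
    for total in cumulative:
        # New cases today = today's cumulative count minus yesterday's.
        new_cases.append(total - previous)
        previous = total
    return new_cases


def incidence_rate(new_cases, population):
    """Scale daily new cases to cases per 100,000 population."""
    return [100_000 * n / population for n in new_cases]


# Hypothetical county: five days of cumulative counts, 50,000 residents.
cumulative = [0, 2, 5, 5, 9]
population = 50_000
new = daily_new_cases(cumulative)
rates = incidence_rate(new, population)
print(new)
print(rates)
```

Reported cumulative series are sometimes revised downward, which would produce negative daily counts under this scheme; how such revisions are handled is not stated in the text.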

We obtain our county-level population data and our county-level demographic data from the U.S. Census Bureau [18]. We extract our county-level Gross Domestic Product (GDP) data from the U.S. Bureau of Economic Analysis’ (BEA) Regional Economic Accounts database (Table CAGDP1) [19]. We retrieve county-level data on the prevalence of obesity, diabetes, smoking, and hypertension from the University of Washington’s Institute for Health Metrics and Evaluation (IHME) [20]. The hypertension and obesity data are for the years 2009 and 2011, respectively, and the diabetes and smoking prevalence data are for 2012. The IHME reports hypertension and obesity data for females and males separately, so we construct a population-weighted average measure for these two covariates based on the proportion of females and males in each county, as reported by the U.S. Census Bureau.
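As a hedged illustration of this weighting, the sex-specific prevalences can be combined as shown here. The symbols \(p^{F}_{i}\) and \(p^{M}_{i}\) (female and male prevalence in county i) and \(w^{F}_{i}\) (female share of the county's population) are illustrative notation, not taken from the data sources:

$$\bar{p}_{i} = w^{F}_{i}\, p^{F}_{i} + \left(1 - w^{F}_{i}\right) p^{M}_{i}$$

For a hypothetical county with \(w^{F}_{i} = 0.52\), \(p^{F}_{i} = 0.30\), and \(p^{M}_{i} = 0.25\), this gives \(\bar{p}_{i} = 0.52 \times 0.30 + 0.48 \times 0.25 = 0.276\).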

The social distancing restrictions data is from the University of Washington’s State-Level Social Distancing Policies in Response to the 2019 Novel Coronavirus in the U.S. repository [21]. The social distancing restrictions include: (1) restrictions on public gatherings exceeding 5, 10, 25, 50, 100, 250, 500, or 1000 people, (2) limits on restaurant operations, (3) closure of specific businesses, e.g. fitness centres, gyms, casinos, etc., (4) closure of non-essential businesses, (5) stay-at-home orders for non-essential activities, (6) state curfews on non-essential activities, (7) mandated quarantines for people entering the state, (8) travel restrictions prohibiting residents from leaving the state, non-residents from entering the state, or residents from travelling across counties within the state, (9) self-isolation requirement for individuals with confirmed COVID-19 incidence, and (10) mandatory wearing of masks or other mouth and nose coverings in public places. We construct our social distancing restrictions index by adding the number of restrictions that are in place in a state on any given day, based on the date at which each restriction is enacted, relaxed, or expired.
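A count-based index of this kind can be sketched as below. The record layout, the restriction names, and the dates are hypothetical, and the sketch assumes that a restriction stops counting on its end date; how partially relaxed restrictions enter the index is not specified in the text and is not modelled here.

```python
from datetime import date

# Hypothetical records for one state:
# (restriction name, date enacted, first date no longer in force or None).
restrictions = [
    ("gathering_limit", date(2020, 3, 16), date(2020, 5, 15)),
    ("stay_at_home", date(2020, 3, 22), date(2020, 5, 1)),
    ("mask_mandate", date(2020, 4, 17), None),
]


def restrictions_index(records, day):
    """Count the restrictions that are in force on `day`."""
    count = 0
    for _name, start, end in records:
        # A restriction counts from its enactment date until its end date.
        if start <= day and (end is None or day < end):
            count += 1
    return count


for day in (date(2020, 3, 20), date(2020, 4, 20), date(2020, 5, 26)):
    print(day.isoformat(), restrictions_index(restrictions, day))
```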

We obtain our mobility data from Descartes Labs [22]. This data consists of mobility indexes calculated at the end of every day and aggregated at the county level. The indexes, which we will refer to as the social mobility indexes, are based on geolocation reports from smartphones and other mobile devices, and track the movements of individual mobile phone subscribers. The methodology employed to construct these indexes is described in Warren et al. [23]. The mobility index data is available at a daily frequency from March 1, 2020, for 2669 counties.

Finally, we construct a comprehensive list of protests that took place across the U.S. based on the List of George Floyd protests in the United States assembled by Wikipedia [24]. At the time of writing, the main Wikipedia page cited 134 news articles from national, regional, and local media outlets, and the secondary pages cited hundreds more. From these media citations, we extracted the location and the date at which the protests reportedly took place, as well as the estimated number of individuals involved in each protest. We complement this process with a search on the Dow Jones Factiva database [25]. The onset of the protests among the counties in which protests took place, i.e., the treatment, is staggered across time and ranges between May 26, 2020, and June 7, 2020, so we center our experiment on the first protest date in each treated county, as opposed to the date of George Floyd’s death, May 25, 2020. Therefore, the George Floyd protests produce a quasi-natural experimental setting with staggered treatment dates, rather than a single treatment date setting.

Our sample period begins on March 1, 2020, when the social mobility data becomes available, and ends on July 7, 2020. This ending date enables us to carry out our estimation on a balanced panel dataset consisting of a [− 30, + 30]-day event window centered on the onset of the protests in each protest county. Our sampling procedure yields a panel dataset consisting of a total of 256,202 county-days representing 2617 (541 protest and 2076 non-protest) counties from all fifty states with incidence rate and covariates data available for our entire estimation window. From this dataset, we form covariate-balanced treatment and control groups using the propensity score matching technique described below and carry out our estimation of the treatment effect.

We report descriptive statistics for new and cumulative SARS-CoV-2 cases in Table 1, broken down by state, along with the total number of counties and the total number of county-days represented in our sample. In Table 2, we report the earliest and the latest ‘first protest’ date within each state’s counties, along with the size of the protest, according to media reports. We provide a map of the continental U.S. in Fig. 1, which reveals the geographic distribution of counties where protests took place along with the size of the first protest that took place within them. Figure 2 shows the evolution of our social distancing restrictions index for a selection of states. Figure 3 shows the social mobility index for a small and a large county in the states of New York and Alabama.

Table 1 Sample description
Table 2 List of U.S. protests
Fig. 1

Counties involved in protests. This figure identifies the counties in which protests took place, according to media reports, along with the number of participants involved in the first protest that took place within each county. Counties within the states of Alaska and Hawaii are not shown, but they are included in our sample

Fig. 2

Social distancing restrictions index. This figure shows the evolution of our social distancing restrictions index from March 1, 2020, to July 7, 2020, for the states of Alabama, California, Florida, and New York. The vertical line corresponds to May 26, 2020, the day of the protests’ onset

Fig. 3

Social mobility index. This figure shows the evolution of the social mobility index from March 1, 2020, to July 7, 2020, for Tompkins and New York counties in the state of New York, and for Lauderdale and Jefferson counties in the state of Alabama

Regression specification

We examine the impact of the spontaneous relaxation of social distancing behavior that was brought on by the George Floyd protests across the U.S. on the SARS-CoV-2 incidence rate with an Ordinary Least Squares (OLS) staggered differences-in-differences (DID) panel regression equation, which is specified as follows:

$$IR_{i,j,t} = \alpha + \beta_{1} Post_{FPi,j,t} + {\mathbf{X}}_{i,j,t}^{\prime}\delta_{C1} + {\mathbf{Y}}_{j,t}^{\prime}\delta_{C2} + \gamma_{i} + \eta_{t} + \epsilon_{i,j,t}, \tag{1}$$

where \(IR_{i,j,t}\), the incidence rate, corresponds to the number of new SARS-CoV-2 infections in county i from state j on day t, per 100,000 population. \(Post_{FPi,j,t}\) is an indicator variable that is set equal to one on the day where protests begin in county i, as well as every day thereafter, and to zero otherwise. This indicator variable is set to zero on each day t for non-protest counties included in our control group. \({\mathbf {X}}_{i,j,t}\) and \({\mathbf {Y}}_{j,t}\) are vectors of county-level and state-level characteristics, which we use as control variables. \(\gamma _{i}\) captures time-invariant state fixed effects, and \(\eta _{t}\) represents time (day) fixed effects to control for changes in the aggregate SARS-CoV-2 incidence rate and common trends between our treatment and control group counties over time.

In Eq. (1), \(\alpha\) is a constant term and \(\beta _{1}\) captures the impact of the relaxation of social distancing brought on by the protests on the SARS-CoV-2 incidence rate. Hence, \(\beta _{1}\) is the parameter of interest in this regression. Under the null hypothesis that the relaxation of social distancing behavior has no causal impact on the SARS-CoV-2 incidence rate, \(\beta _{1}\) should be statistically indistinguishable from zero. We cluster the standard errors at the county level to account for any potential cross-sectional and time-serial dependence in the error terms, \(\epsilon _{i,j,t}\) [26, 27]. We perform our statistical analysis with Stata 16 and use the REGHDFE command to estimate Eq. (1) [28].
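To make the logic of the DID comparison concrete, the sketch below builds the post-protest indicator on an event-time axis and computes a plain difference-in-means DID estimate. It is a deliberately simplified stand-in for Eq. (1): it omits the control variables, the fixed effects, and the clustered standard errors, and every name and number in it is hypothetical.

```python
def post_indicator(treated, event_day):
    """Post_FP: 1 for a protest county on or after its first protest day, else 0."""
    return 1 if treated and event_day >= 0 else 0


def mean(values):
    return sum(values) / len(values)


def did_estimate(rows, window=30):
    """Difference-in-means DID on rows of (treated, event_day, incidence_rate).

    Returns (treated post - treated pre) - (control post - control pre),
    using only observations inside the [-window, +window] event window.
    """
    cells = {(g, p): [] for g in (True, False) for p in (True, False)}
    for treated, event_day, rate in rows:
        if -window <= event_day <= window:
            cells[(treated, event_day >= 0)].append(rate)
    change_treated = mean(cells[(True, True)]) - mean(cells[(True, False)])
    change_control = mean(cells[(False, True)]) - mean(cells[(False, False)])
    return change_treated - change_control


# Toy data: event_day is measured relative to the (matched) first protest date.
rows = [
    (True, -2, 5.0), (True, -1, 5.0), (True, 0, 7.0), (True, 1, 7.0),
    (False, -2, 4.0), (False, -1, 4.0), (False, 0, 4.5), (False, 1, 4.5),
    (True, 45, 100.0),  # outside the event window, so it is ignored
]
estimate = did_estimate(rows)
print(estimate)
```

In the regression form, the day fixed effects absorb the common change over time that this sketch removes by subtracting the control group's pre-to-post change.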

Covariates

We include county-level control variables that may influence the incidence rate of SARS-CoV-2 in our staggered DID regression specification. These control variables account for demographic, health, living proximity, and income level variations across counties. For demographic controls, we include sex (Male) and age (60 years+) since these factors are associated with both an increased risk of testing positive for SARS-CoV-2 and greater illness severity [29]. We also include ethnicity, i.e., Asian, Black, Hispanic, and White, as demographic control variables, to account for the increased risk of a positive SARS-CoV-2 test observed among certain ethnicities, especially Black and Hispanic individuals. Our demographic variables are expressed in decimals, and represent the fraction of a county’s total population that falls in a particular group, based on the U.S. Census Bureau’s county-level population statistics for 2018. We include Diabetes prevalence, Hypertension prevalence, Obesity prevalence, and Smoking prevalence as health control variables. Obesity, diabetes, and hypertension are clinical risk factors that are associated with an increased risk of severe illness, and a greater risk of mortality from COVID-19 [30]. Smoking is also a clinical risk factor, as some evidence suggests that smoking may be associated with an increased severity of COVID-19 [31]. We include the natural logarithm of population density, ln(Population density), among our control variables, as higher incidence rates of SARS-CoV-2 are observed in more densely populated, urban areas [30, 32]. Finally, consistent with previous research showing that residents from more economically deprived areas are more likely to test positive for SARS-CoV-2, we use the natural logarithm of real GDP per capita, ln(Per capita RGDP), to control for income in our regressions [30].

In the period preceding the onset of the protests, the number of new COVID-19 cases began to drop steadily across the country [3]. Accordingly, several states began to relax their social distancing restrictions in a carefully staged manner. Figure 2 illustrates this trend in Alabama, California, Florida, and New York, for instance. Starting in mid-March, we observe a steady rise in our social distancing restrictions index in these four states and we observe the start of a slow unwind by mid-April. Notably, while social distancing restrictions were being relaxed across the nation, social mobility was also on the rise (see Fig. 3). The concurrent relaxation of social distancing restrictions and the increase in social mobility around the onset of the protests may very well have contributed to an increase in the SARS-CoV-2 incidence rate during the event period that is unrelated to the protests, so we include our social distancing restrictions and social mobility indexes in our DID regression equation (1), as additional control variables.

Propensity score matching

The first panel of Table 3 reveals statistically significant differences between the protest and non-protest counties included in our sample on nearly every dimension represented by the covariates introduced in the previous sub-section, the exception being the proportion of Black residents. Relative to protest counties, non-protest counties have significantly higher proportions of male, White, and 60-years+ residents, are less healthy and less wealthy, are less densely populated, and are significantly more socially mobile. These differences between the two groups may introduce selection bias into our experiment. This is a common concern with observational studies, such as the present one, where the subjects are not randomly assigned to the treatment and control groups by the researcher [33]. To ensure that our control group is as similar as possible to our treatment group from the perspective of all these covariates, i.e., to minimize any potential selection bias in our experiment, we form our treatment and control groups using the propensity score matching technique [34]. In the context of our experiment, the propensity score represents the estimated likelihood that a county experiences a protest, i.e., receives the treatment, given its covariates.

Table 3 Summary statistics for covariates

Essentially, the matching process begins with a logistic regression in which the dependent variable is set to one for the 541 protest (treated) counties included in our sample, and to zero for the remaining 2077 non-protest (untreated) counties. The independent variables included in this regression correspond to our covariates, all of which have been shown to influence the likelihood of contracting SARS-CoV-2. Next, we match treated counties to their nearest neighbour from the untreated group, without replacement, with a standard caliper of 0.25 standard deviations, based on the propensity scores from the logistic regression [35, 36]. This process yields a balanced sample consisting of 356 treated and 356 untreated counties. As Table 3 shows, from the perspective of our covariates, these two groups do not exhibit any statistically significant differences from each other, with the exception of Hypertension prevalence, which is significantly higher in our treatment group than in our control group, albeit only at the 5% level.
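The matching step can be illustrated with a greedy one-to-one nearest-neighbour routine. This is a sketch under stated assumptions rather than the authors' procedure: it takes the propensity scores as given (the logistic regression is not shown), measures the caliper as 0.25 standard deviations of the pooled propensity scores, and processes treated counties in input order. The county identifiers and scores are hypothetical.

```python
import statistics


def match_nearest(treated, control, caliper_sd=0.25):
    """Greedy 1:1 nearest-neighbour matching on propensity scores, without replacement.

    treated, control: dicts mapping a county id to its propensity score.
    Returns a list of (treated_id, control_id) pairs within the caliper.
    """
    pooled = list(treated.values()) + list(control.values())
    # Assumption: the caliper is a fraction of the pooled score standard deviation.
    caliper = caliper_sd * statistics.pstdev(pooled)
    available = dict(control)
    pairs = []
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control county by absolute score distance.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement: each control is used once
    return pairs


treated = {"T1": 0.80, "T2": 0.50, "T3": 0.10}
control = {"C1": 0.78, "C2": 0.52, "C3": 0.55, "C4": 0.95}
pairs = match_nearest(treated, control)
print(pairs)
```

Treated counties with no control inside the caliper (here the hypothetical T3) are dropped, which is one reason a matched sample can be smaller than the original treated group.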

Our quasi-natural experimental setting satisfies at least two key requirements for the identification of the causal link between social distancing and the spread of SARS-CoV-2, namely: (1) the existence of a strong theoretical basis supporting the relationship in question and, (2) exogenous variation in the variable of interest, i.e., social distancing [37]. The presence of an exogenous shock in our setting, i.e., protests arising spontaneously in some counties as a result of a tragic event, is key to establishing causality, as this mitigates concerns that omitted variables correlated with both the protests and the spread of SARS-CoV-2 might be driving our findings. This setting also minimizes concerns about endogeneity and self-selection, which beset most non-randomized-trial experiments.

In sum, thanks to the covariate balance that we are able to achieve with our propensity score matching process, our staggered DID regression specification is uniquely well positioned to separate the impact of the relaxation of social distancing behaviour on the SARS-CoV-2 incidence rate from other factors that may potentially affect the spread of the disease. Next, to address any potential concerns that our findings may be contaminated by confounding events, we exclude from our regression the county-days that fall outside of the [− 30, + 30]-day event window centered on the day when protests begin in a protest county [33, 38].

Results

Impact of protests on SARS-CoV-2 incidence

We report results from regression equation (1) in Table 4. The coefficient of interest in this regression is \(\beta _1\), which is associated with \(Post_{FP}\), our post-protest indicator variable. This coefficient is positive and highly statistically significant (1.34; 95% CI 0.21–2.47), implying that the SARS-CoV-2 incidence rate increases by 1.34 cases per day, per 100,000, on average, following the onset of the protests in protest counties, relative to their propensity score matching non-protest counterparts. To put this finding into perspective, recall that the average number of new cases across all counties is equal to 5 per day, per 100,000 population, in the week preceding the onset of the protests (see Column (2) of Table 1). Using this number as a reference point, this finding suggests that the SARS-CoV-2 incidence rate increases by 1.34/5 = 26.8% following the onset of the protests, due to the relaxation of social distancing brought on by the protests.
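The arithmetic behind these figures can be laid out explicitly. The relative increase follows directly from the numbers above:

$$\frac{1.34}{5} = 0.268 = 26.8\%$$

As a rough consistency check, and assuming the 95% confidence interval is symmetric and built with a normal critical value of 1.96 (an assumption, since the exact critical value used is not stated), the implied standard error and t-statistic are:

$$SE \approx \frac{2.47 - 0.21}{2 \times 1.96} \approx 0.577, \qquad t \approx \frac{1.34}{0.577} \approx 2.32$$

This agrees with the robust t-statistic of 2.32 reported alongside the placebo test.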

Table 4 Impact of protests on SARS-CoV-2 infections

Even if our observed covariates are well-balanced, one still needs to assess whether the parallel trends assumption underpinning the DID design is satisfied. We assess whether pre-treatment trends for our treatment and control groups are parallel by estimating a “leads and lags model” [39]. In this model, we replace our \(Post_{FP}\) indicator variable in Eq. (1) with a family of period-specific indicator variables spanning the pre- and post-protest event window. Each indicator variable is set equal to one for treated counties for a specific 5-day period surrounding the onset of the protests, and to zero otherwise. Under the null hypothesis that pre-treatment trends are parallel, the coefficients associated with the pre-treatment indicator variables should not exhibit any pattern and should be statistically insignificant. Meanwhile, the coefficients associated with the post-treatment indicator variables will reveal the treatment effect as it manifests itself in the data during the post-protest period.

Figure 4 plots the value of the coefficients associated with our pre- and post-protest indicator variables. In this figure, p corresponds to the five-day period starting on the protest date and ending 4 days later, i.e., [0, 4], + 1p is for days [5, 9], and − 1p is for days [− 5, − 1]. We do not observe any clear trend in the pre-treatment periods and none of the coefficients are statistically different from zero, suggesting that the parallel trends assumption is satisfied. Post-protest, we observe a clear upward trend in the magnitude of the coefficients, which is reversed in period + 4p. The treatment effect becomes statistically different from zero in period + 2p, roughly ten days following the onset of the protests. This is consistent with SARS-CoV-2’s incubation period and typical testing wait times. Finally, we note the attenuation of the treatment effect in period + 4p. This is to be expected, as the impact of the relaxation of social distancing brought on by the protests must eventually die out. In sum, the treatment effect documented in Table 4 unfolds over time in a manner that supports the hypothesis that social distancing causally impacts the spread of SARS-CoV-2.
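The binning of event days into 5-day periods can be sketched as below. The function names are hypothetical, and the treatment of the edges of the event window (for example, day + 30) is not described in the text, so the sketch simply applies the same rule everywhere.

```python
def period_index(event_day, width=5):
    """Map an event day to its period number.

    Floor division sends days [0, 4] to 0, [5, 9] to 1, and [-5, -1] to -1.
    """
    return event_day // width


def period_label(event_day, width=5):
    """Return the label used in the figure: 'p', '+1p', '-1p', and so on."""
    k = period_index(event_day, width)
    if k == 0:
        return "p"
    sign = "+" if k > 0 else "-"
    return f"{sign}{abs(k)}p"


for day in (-6, -5, -1, 0, 4, 5, 10):
    print(day, period_label(day))
```

A period-specific indicator for a treated county is then equal to one when the county-day falls in that period, and to zero otherwise.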

Fig. 4

Timing of the protests’ impact on the SARS-CoV-2 incidence rate. Each bar provides the point estimate of the difference between the SARS-CoV-2 incidence rate in protest counties relative to their propensity score matched non-protest counterparts, for 5-day periods around the onset of the protests. For instance, p corresponds to the period starting on the day of the protests and ending four days later, i.e., days [0, 4], + 1p is for days [5, 9], and − 1p is for days [− 5, − 1]. The 95% confidence band is superimposed on each point estimate

Placebo test

We conduct a placebo test to assess whether the causal impact of the protests on the spread of SARS-CoV-2 that we document in Table 4 can be attributed to chance. For this purpose, we implement a Monte Carlo simulation exercise centered on our staggered DID panel regression specification, i.e., Eq. (1). In each iteration of this simulation, we assign 541 counties randomly to the potential treatment group and the remaining 2077 counties to the potential control group. We then implement our propensity score matching process to create a balanced sample of treated and control counties. We perform this matching process without replacement with the 0.25 standard deviation caliper, as per “Propensity score matching” section. Next, we assign a [− 30, + 30]-day event period to each treated county randomly with start dates ranging between March 1, 2020, and May 8, 2020. Then, we align each control group county’s timeline to its treated counterpart’s event timeline and create the \(Post_{FPi,j,t}\) indicator variable. Once this step has been completed, we estimate our staggered DID regression specification on the simulated sample and collect the \(\beta _{1}\) coefficient estimate, along with its county-cluster robust t-statistic. We implement this process 5000 times to produce the simulated distribution of \(\beta _{1}\) coefficients and associated statistics. If the \(\beta _{1}\) estimate from Table 4 lies above the 95% threshold from the distribution of simulated \(\beta _{1}\) coefficient estimates, we can conclude with a high level of confidence that the treatment effect that we document in this paper cannot be attributed to chance.
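The structure of this simulation can be illustrated with a much-reduced sketch. It keeps only the core idea (randomly reassign treatment, re-estimate, and read off an upper percentile of the simulated estimates) and leaves out the propensity score matching and the random event windows. The estimator is a hypothetical difference in means standing in for Eq. (1), and all data are made up.

```python
import math
import random


def percentile(values, q):
    """Nearest-rank percentile for an integer q between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def mean_gap(outcomes, treated_ids):
    """Hypothetical estimator: mean outcome of 'treated' minus mean of the rest."""
    treated = [v for k, v in outcomes.items() if k in treated_ids]
    control = [v for k, v in outcomes.items() if k not in treated_ids]
    return sum(treated) / len(treated) - sum(control) / len(control)


def placebo_distribution(outcomes, n_treated, estimator, iterations, seed=0):
    """Re-estimate the effect under random (placebo) treatment assignments."""
    rng = random.Random(seed)
    ids = list(outcomes)
    draws = []
    for _ in range(iterations):
        fake_treated = set(rng.sample(ids, n_treated))
        draws.append(estimator(outcomes, fake_treated))
    return draws


# Toy outcomes for 40 hypothetical counties.
outcomes = {f"county{i}": float(i % 7) for i in range(40)}
draws = placebo_distribution(outcomes, n_treated=10, estimator=mean_gap, iterations=500)
threshold_95 = percentile(draws, 95)
print(threshold_95)
```

An empirical estimate would then be compared against `threshold_95`: an estimate above it is larger than 95% of the estimates obtained under random assignment.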

We present results from this placebo test in Table 5. The 95% and 99% threshold values for the \(\beta _{1}\) coefficient from the simulated distribution are equal to 0.57 and 1.42, respectively, while our empirical estimate in Table 4 is equal to 1.34. Likewise, the 95% and 99% threshold values for the robust t-statistics from the simulated distribution are equal to 0.44 and 1.01, respectively, while the robust t-statistic associated with our \(\beta _{1}\) coefficient estimate in Table 4 is equal to 2.32. Since our \(\beta _{1}\) estimate and its associated robust t-statistic are well beyond their respective 95% simulated threshold values, we can safely reject the null hypothesis that relaxing social distancing behavior has no impact on the spread of SARS-CoV-2 and, with a high degree of confidence, we can rule out the possibility that the treatment effect that we document in Table 4 is attributable to chance.

Table 5 Placebo tests

Discussion

In this paper, we exploit the quasi-natural experimental setting created by the spontaneous relaxation of social distancing brought on by the protests that erupted across the U.S. following George Floyd’s tragic death on May 25, 2020, to assess the causal impact of social distancing on the spread of SARS-CoV-2 in the U.S. Using a staggered difference-in-difference specification and a balanced sample covering the [− 30, + 30]-day event window centered on the onset of the protests, we document an increase of 1.34 cases per day, per 100,000 population, in the SARS-CoV-2 incidence rate in protest counties, relative to their propensity score matched non-protest counterparts. This represents a 26.8% increase in the incidence rate relative to the week preceding the onset of the protests.

Strengths and weaknesses

Early predictive models assessing the effectiveness of social distancing have suggested that a greater spread of SARS-CoV-2 would occur in the absence of social distancing measures [40,41,42]. Similarly, our study demonstrates that when social distancing is reduced, i.e., by individuals protesting in close proximity, the spread of SARS-CoV-2 increases. Our study differs from its predecessors because instead of examining the effectiveness of social distancing measures following their imposition [11, 12, 14], we examine the impact of social distancing on the spread of COVID-19 when social distancing behavior is abruptly relaxed. Additionally, unlike previous studies, we do not use mobility as a measure of social distancing; instead, we use social mobility as a control variable in our analyses. By explicitly controlling for the concurrent increase in social mobility and the relaxation of state-imposed social distancing restrictions during the period surrounding the protests, our study demonstrates that social distancing directly impacts the spread of SARS-CoV-2. We also control for a host of covariates known to influence the transmission of SARS-CoV-2, and implement placebo tests to rule out the possibility that our results are attributable to chance. Therefore, we can be confident that the increase in SARS-CoV-2 incidence that we observe following the onset of the protests can be attributed to the relaxation of social distancing behavior.

Our study is not without limitations. In particular, over 70 testing centers across the U.S. were closed following the onset of the protests. We are also unable to assess protest participants’ vulnerability (e.g., age, underlying health conditions, personal protective wear, etc.), and variability along these dimensions may influence the risk of SARS-CoV-2 incidence. Additionally, we cannot control for the actual degree of physical proximity between participants, which would impact the transmission rate of SARS-CoV-2 during the protests. We are also unable to control for any potential under-reporting of COVID-19 cases over time and across counties [43]. This would be a concern if protest counties and non-protest counties were impacted differently by this phenomenon. Moreover, we rely on the accuracy of media reports to identify the counties in which protests took place. Finally, we do not account for the magnitude of the protests in each county; however, expressing the case counts in rates rather than in levels should minimize any potential scale-related effects.

Future research and implications

Future research using this experimental setting could use machine learning tools to analyze protest videos and determine the relative contribution of participant demographics, the degree of physical distancing, and the extent and type of personal protective wear to the spread of SARS-CoV-2. Social mobility data might also be used to track the extent to which people who participated in protests visited a SARS-CoV-2 testing center at any point before or after they partook in protests. Taken together, this study demonstrates that, when controlling for social distancing restrictions, social mobility, and a host of other potential risk factors for the contraction of SARS-CoV-2, the relaxation of social distancing behavior causally impacts the spread of SARS-CoV-2. As states are in the midst of relaxing the social distancing restrictions initially imposed in March 2020, establishing the effectiveness of social distancing behavior in a statistically reliable way has important public health implications. Our research informs policy makers and provides insights regarding the usefulness of social distancing as an intervention to minimize the spread of SARS-CoV-2, and reduce the risk of a second, and possibly third, wave of COVID-19.

Availability of data and materials

All the studies cited in this paper are peer-reviewed journal articles or preprints and can be accessed in the public domain. All datasets utilized to conduct this experiment (U.S. COVID-19 data, county-level demographic data, social distancing restrictions, and the list of protests data) are publicly accessible and links to these sources are provided in the list of references. The Stata dataset that was constructed for this study is available from the authors upon request.

References

  1. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395(10223):497–506.

  2. Cucinotta D, Vanelli M. WHO declares COVID-19 a pandemic. Acta Biomed. 2020;91(1):157–60.

  3. Johns Hopkins University. COVID-19 map - Johns Hopkins Coronavirus Resource Center. Johns Hopkins Coronavirus Resource Center. 2020. https://coronavirus.jhu.edu/map.html.

  4. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–4.

  5. Chu DK, Akl EA, Duda S, Solo K, Yaacoub S, Schünemann HJ, et al. Physical distancing, face masks, and eye protection to prevent person-to-person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis. Lancet. 2020;395(10242):1973–87.

  6. Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020;368(6490):489–93.

  7. Setti L, Passarini F, De Gennaro G, Barbieri P, Perrone MG, Borelli M, et al. Airborne transmission route of COVID-19: why 2 meters/6 feet of inter-personal distance could not be enough. Int J Environ Res Public Health. 2020;17(8):2932.

  8. Centers for Disease Control and Prevention. How coronavirus spreads. Coronavirus disease 2019 (COVID-19). 2020. https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/how-covid-spreads.html.

  9. Adolph C, Amano K, Bang-Jensen B, Fullman N, Wilkerson J. Pandemic politics: timing state-level social distancing responses to COVID-19. medRxiv. 2020. https://doi.org/10.1215/03616878-8802162.

  10. Wagner AB, Hill EL, Ryan SE, Sun Z, Deng G, Bhadane S, et al. Social distancing has merely stabilized COVID-19 in the US. medRxiv. 2020. https://doi.org/10.1101/2020.04.27.20081836.

  11. Abouk R, Heydari B. The immediate effect of COVID-19 policies on social distancing behavior in the United States. SSRN Electron J. 2020.

  12. Courtemanche C, Garuccio J, Le A, Pinkston J, Yelowitz A. Strong social distancing measures in the United States reduced the COVID-19 growth rate. Health Aff. 2020;39(7):1237–46.

  13. Maloney W, Taskin T. Determinants of social distancing and economic activity during COVID-19 a global view. SSRN Electron J. 2020. https://ssrn.com/abstract=3599572.

  14. Delen D, Eryarsoy E, Davazdahemami B. No place like home: a cross-national assessment of the efficacy of social distancing during the COVID-19 pandemic (Preprint). JMIR Public Health Surveill. 2020;6:1–10.

  15. Carroll C, Bhattacharjee S, Chen Y, Dubey P, Fan J, Gajardo A, et al. Time dynamics of COVID-19. medRxiv. 2020. p. 2020.05.21.20109405.

  16. Taylor DB. George Floyd protests: a timeline. 2021. https://www.nytimes.com/article/george-floyd-protests-timeline.html.

  17. COVID19. Johns Hopkins Whiting School of Engineering. 2020. https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series.

  18. US demographic data. United States Census Bureau. 2020. https://www.census.gov.

  19. Regional economic accounts: download. Bureau of Economic Analysis. 2020. https://apps.bea.gov/regional/downloadzip.cfm.

  20. Global health data exchange. University of Washington. 2020. https://ghdx.healthdata.org.

  21. Social distancing. Covid state policy. 2020. https://github.com/COVID19StatePolicy/SocialDistancing.

  22. Data for mobility changes in response to COVID-19. Descartes Labs. 2020. https://github.com/descarteslabs/DL-COVID-19.

  23. Warren MS, Skillman SW. Mobility changes in response to COVID-19. http://arxiv.org/abs/2003.14228.

  24. List of George Floyd protests in the United States. Wikipedia. 2020. https://en.wikipedia.org/wiki/List_of_George_Floyd_protests_in_the_United_States.

  25. List of George Floyd protests in the United States. Factiva. 2020. https://www.dowjones.com/professional/factiva/.

  26. Petersen MA. Estimating standard errors in finance panel data sets: comparing approaches. Rev Financial Stud. 2009;22(1):435–80. https://doi.org/10.1093/rfs/hhn053.

  27. Hoechle D. Robust standard errors for panel regressions with cross-sectional dependence. Stata J. 2007;7(3):281–312.

  28. Correia S. REGHDFE: Stata module to perform linear or instrumental-variable regression absorbing any number of high-dimensional fixed effects. Statistical Software Components, Boston College Department of Economics. 2014. https://ideas.repec.org/c/boc/bocode/s457874.html.

  29. Parohan M, Yaghoubi S, Seraji A, Javanbakht MH, Sarraf P, Djalali M. Risk factors for mortality in patients with Coronavirus disease 2019 (COVID-19) infection: a systematic review and meta-analysis of observational studies. Aging Male. 2020;23:1–9.

  30. de Lusignan S, Dorward J, Correa A, Jones N, Akinyemi O, Amirthalingam G, et al. Risk factors for SARS-CoV-2 among patients in the Oxford Royal College of General Practitioners Research and Surveillance Centre primary care network: a cross-sectional study. Lancet Infect Dis. 2020;20(9):1034–42.

  31. Vardavas CI, Nikitara K. COVID-19 and smoking: a systematic review of the evidence. Tob Induc Dis. 2020. https://doi.org/10.18332/tid/119324.

  32. Sy KTL, White LF, Nichols BE. Population density and basic reproductive number of COVID-19 across United States counties. medRxiv. 2020. p. 2020.06.12.20130021.

  33. Atanasov V, Black B. Shock-based causal inference in corporate finance and accounting research. Crit Finance Rev. 2016;5:207–304.

  34. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.

  35. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am Stat. 1985;39(1):33–8.

  36. Lunt M. Selecting an appropriate caliper can be essential for achieving good balance with propensity score matching. Am J Epidemiol. 2014;179(2):226–35.

  37. Kahn R, Whited TM. Identification is not causality, and vice versa. Rev Corp Finance Stud. 2018;7(1):1–21.

  38. Fauver L, Hung M, Li X, Taboada AG. Board reforms and firm value: worldwide evidence. J Financial Econ. 2017;125:120–42.

  39. Autor D. Outsourcing at will: the contribution of unjust dismissal doctrine to the growth of employment outsourcing. J Labor Econ. 2003;21:1–42.

  40. Matrajt L, Leung T. Evaluating the effectiveness of social distancing interventions against COVID-19. medRxiv. 2020. p. 2020.03.27.20044891.

  41. Friston KJ, Parr T, Zeidman P, Razi A, Flandin G, Daunizeau J, et al. Second waves, social distancing, and the spread of COVID-19 across America. http://arxiv.org/abs/2004.13017.

  42. Anderson RM, Heesterbeek H, Klinkenberg D, Hollingsworth TD. How will country-based mitigation measures influence the course of the COVID-19 epidemic? Lancet. 2020;395(10228):931–4.

  43. Albani V, Loria J, Massad E, Zubelli J. COVID-19 underreporting and its impact on vaccination strategies. BMC Infect Dis. 2021. https://doi.org/10.1186/s12879-021-06780-7.

Acknowledgements

We wish to express our sincere thanks to Descartes Labs for making their mobility data available to us.

Funding

LG acknowledges financial support from the Smith School of Business Distinguished Faculty Fellowship at Queen’s University.

Author information

Contributions

LG conceived the study, and all authors contributed to the final study design. LG performed the data analysis, created the tables and figures, and wrote the methods and results sections in the initial draft of the manuscript. SG and JL conducted the literature search and assisted LG with the data collection. All authors contributed substantially to the interpretation of the data and equally to the write-up. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Louis Gagnon.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflicts of interest and confirm that they have read BMC’s guidance on competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Gagnon, L., Gagnon, S. & Lloyd, J. Social distancing causally impacts the spread of SARS-CoV-2: a U.S. nationwide event study. BMC Infect Dis 22, 787 (2022). https://doi.org/10.1186/s12879-022-07763-y

  • DOI: https://doi.org/10.1186/s12879-022-07763-y

Keywords