Forecast COVID-19 cases in the US under different levels of social distancing

## Summary of Methodologies

We measure the extent to which social distancing reduces the speed at which COVID-19 spreads. We then run simulations to forecast the rates of COVID-19 spread under different social distancing levels. We find that COVID-19 spreads less than proportionately with the number of contagious individuals, in contrast to standard models, which assume proportional spread. We also observe that social distancing greatly reduces the spread of COVID-19.

The model we estimate is a modified version of a susceptible-infected-recovered (SIR) model:

$$y_{i,t} = R_{i,t}\, S_{i,t}\, (Y_{i,t-2} - Y_{i,t-8})^{\omega} \qquad (1)$$

where $y_{i,t}$ is the number of new infections in county $i$ on day $t$, $R_{i,t}$ is the rate at which infectious individuals transmit the disease, $S_{i,t}$ is the percentage of the county population that has not yet had COVID-19, and $Y_{i,t}$ is the cumulative number of individuals who have been infected by day $t$.

The most crucial difference between this model and standard SIR models is that standard SIR models constrain $\omega = 1$. However, a model with this constraint does not perform well out of sample. We instead find that $\omega = 0.57$. As we note in our paper, such a result would be expected if contagious individuals expose many of the same unexposed individuals, which could occur if cases are clustered within households, nursing homes, or places of work. This concavity implies that although cases may grow exponentially at first, either at the beginning of the pandemic or when social restrictions are eased, the number of new cases fairly quickly levels off and then stays relatively steady, declining slowly over time (unless a further intervention occurs).
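The levelling-off behaviour can be illustrated with a minimal simulation of equation (1) for a single county. This is only a sketch, not the simulation code used in the research: it holds the transmission rate constant, treats $S$ as a fraction between 0 and 1, and assumes cumulative infections were zero before day 0. The function name `simulate_new_cases` and all of its parameters are hypothetical.

```python
def simulate_new_cases(r, omega, population, seed_cases, days):
    """Iterate y_t = r * S_t * (Y_{t-2} - Y_{t-8}) ** omega for one county.

    Assumptions (not from the source): r is constant over time, S_t is the
    susceptible fraction at the end of the previous day, and cumulative
    infections before day 0 are zero.
    """
    cumulative = [float(seed_cases)]  # Y_0: infections seeded on day 0
    new_cases = []
    for t in range(1, days + 1):
        # Lagged cumulative counts; anything before day 0 counts as zero.
        y_lag2 = cumulative[t - 2] if t >= 2 else 0.0
        y_lag8 = cumulative[t - 8] if t >= 8 else 0.0
        contagious = y_lag2 - y_lag8  # infected between days t-7 and t-2
        susceptible = max(0.0, 1.0 - cumulative[-1] / population)
        y = r * susceptible * contagious ** omega
        # New infections cannot exceed the people who are still uninfected.
        y = min(y, population - cumulative[-1])
        new_cases.append(y)
        cumulative.append(cumulative[-1] + y)
    return new_cases


if __name__ == "__main__":
    # Hypothetical inputs chosen only to contrast the two regimes.
    concave = simulate_new_cases(r=2.0, omega=0.57, population=1_000_000,
                                 seed_cases=10, days=120)
    linear = simulate_new_cases(r=0.3, omega=1.0, population=1_000_000,
                                seed_cases=10, days=60)
    print("omega=0.57, days 90 and 120:", concave[89], concave[119])
    print("omega=1.00, days 30 and 60:", linear[29], linear[59])
```

With these illustrative inputs, the $\omega = 0.57$ path settles at a nearly constant daily count that drifts down slowly as the susceptible share shrinks, while the $\omega = 1$ path keeps growing roughly exponentially over the same horizon.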

We allow $R_{i,t}$ to vary with a number of factors instead of treating it as a constant parameter:

$$R_{i,t} = \exp(\alpha_i + \beta_t + \lambda d_{i,t} + \theta h_{i,t} + \mu m_{i,t} + \varepsilon_{i,t}) \qquad (2)$$

This specification allows transmission rates to differ across counties (county fixed effects $\alpha_i$ reflect different population densities and demographics), dates (date fixed effects $\beta_t$ accommodate different rates of testing and different rates of reporting that happen on weekdays vs. weekends), levels of social distancing $d_{i,t}$, and different temperatures and humidity, $h_{i,t}$ and $m_{i,t}$, respectively. The social distancing measure is based on cellphone GPS location data that are provided by SafeGraph for free to researchers studying COVID-19.

We estimate equation (1) by taking the logarithm of both sides, and then subtracting $\ln(S_{i,t})$ from both sides. We add 1 to the number of new cases to ensure that the left-hand side is well-defined.
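Written out, these steps give a regression that is linear in the parameters. The form below is a sketch obtained by substituting equation (2) into equation (1) and applying the transformations just described; the text here does not print it explicitly, so its exact form is an assumption:

$$\ln(y_{i,t} + 1) - \ln(S_{i,t}) = \alpha_i + \beta_t + \lambda d_{i,t} + \theta h_{i,t} + \mu m_{i,t} + \omega \ln(Y_{i,t-2} - Y_{i,t-8}) + \varepsilon_{i,t}$$

In this form $\omega$ is the coefficient on the log of the contagious count, and $\lambda$ is the change in $\ln R_{i,t}$ per unit change in the social distancing measure.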

Observed social distancing levels and social distancing regulations are not determined in a vacuum: people distance more in areas that are hit harder by COVID-19. Thus, $\varepsilon_{i,t}$ may be correlated with social distancing, biasing the estimated impact of social distancing on the rate of contagion. We therefore use an instrumental variables (IV) technique to correct for this endogeneity bias, with the amount of rain as our instrument for social distancing. The first-stage F-statistic for rain as an instrument is 214.44, which is highly significant, indicating that rain is a strong instrument.
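The logic of the IV correction can be sketched with two-stage least squares on synthetic data. Everything below is illustrative: the data-generating process, the coefficient values, and the function names are assumptions made for this example, and the sketch omits the county and date fixed effects and weather controls of the actual specification. An unobserved severity shock raises both distancing and transmission, which biases ordinary least squares, while rain shifts distancing without entering the outcome directly.

```python
import numpy as np


def make_synthetic_data(n, true_effect, seed):
    """Generate hypothetical data in which distancing is endogenous."""
    rng = np.random.default_rng(seed)
    rain = rng.normal(size=n)      # instrument: affects distancing only
    severity = rng.normal(size=n)  # unobserved shock driving both variables
    distancing = 0.5 * rain + 0.8 * severity + 0.5 * rng.normal(size=n)
    log_rate = true_effect * distancing + severity + 0.5 * rng.normal(size=n)
    return {"rain": rain, "distancing": distancing, "log_rate": log_rate}


def ols_slope(outcome, regressor):
    """Slope from an ordinary least squares fit with an intercept."""
    x = np.column_stack([np.ones(len(outcome)), regressor])
    coef, *_ = np.linalg.lstsq(x, outcome, rcond=None)
    return coef[1]


def two_stage_least_squares(outcome, endogenous, instrument):
    """Just-identified 2SLS with an intercept; returns the slope estimate."""
    n = len(outcome)
    z = np.column_stack([np.ones(n), instrument])
    # First stage: predict the endogenous regressor from the instrument.
    first_stage, *_ = np.linalg.lstsq(z, endogenous, rcond=None)
    fitted = z @ first_stage
    # Second stage: regress the outcome on the first-stage fitted values.
    x_hat = np.column_stack([np.ones(n), fitted])
    second_stage, *_ = np.linalg.lstsq(x_hat, outcome, rcond=None)
    return second_stage[1]


if __name__ == "__main__":
    data = make_synthetic_data(n=20_000, true_effect=-1.0, seed=0)
    print("OLS slope:", ols_slope(data["log_rate"], data["distancing"]))
    print("IV slope: ", two_stage_least_squares(
        data["log_rate"], data["distancing"], data["rain"]))
```

In this synthetic setup the OLS slope understates the protective effect of distancing because the severity shock pushes distancing and transmission in the same direction, while the IV slope lands close to the true effect built into the data.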