Relationship of socio-demographics, comorbidities, symptoms and healthcare access with early COVID-19 presentation and disease severity

Background COVID-19 studies are primarily from the inpatient setting, skewing towards severe disease. Race and comorbidities predict hospitalization; however, the ambulatory presentation of milder COVID-19 disease and the characteristics associated with progression to severe disease are not well understood. Methods We conducted a retrospective chart review including all COVID-19 positive cases from Stanford Health Care (SHC) in March 2020 to assess demographics, comorbidities and symptoms in relationship to: 1) their access point of testing (outpatient, inpatient, or emergency room (ER)) and 2) development of severe disease. Results Two hundred fifty-seven patients tested positive: 127 (49%), 96 (37%), and 34 (13%) at outpatient, ER and inpatient, respectively. Overall, 61% were age < 55; age > 75 was rarer in the outpatient setting (11%) than in the ER (14%) or inpatient (24%) setting. Most patients presented with cough (86%), fever/chills (76%), or fatigue (63%). 65% of inpatients reported shortness of breath compared to 30–32% of outpatients and ER patients. Ethnic/minority patients had a significantly higher risk of developing severe disease (Asian OR = 4.8 [1.6–14.2], Hispanic OR = 3.6 [1.1–11.9]). Medicare-insured patients were marginally more likely to develop severe disease (OR = 4.0 [0.9–17.8]). Other factors associated with developing severe disease included kidney disease (OR = 6.1 [1.0–38.1]), cardiovascular disease (OR = 4.7 [1.0–22.1]), shortness of breath (OR = 5.4 [2.3–12.6]) and GI symptoms (OR = 3.3 [1.4–7.7]); hypertension without concomitant CVD or kidney disease was marginally significant (OR = 2.3 [0.8–6.5]). Conclusions Early widespread symptomatic testing for COVID-19 in Silicon Valley included many less severely ill patients. Thorough manual review of symptomatology reconfirms the heterogeneity of COVID-19 symptoms and the challenges in using clinical characteristics to predict decline. We re-demonstrate that socio-demographics are consistently associated with severity.


Background
During the week of February 23, 2020, community spread of the virus that causes COVID-19 was reported in California. On March 7 the Centers for Disease Control and Prevention (CDC) reported 213 confirmed cases in the US [1]. Early data on COVID-19 symptoms are primarily from the inpatient setting and therefore skew towards the more severely ill, leaving many knowledge gaps around characteristics of the disease in the ambulatory population.
Health professionals have struggled to identify who is most at risk of severe disease, as COVID-19's variable and wide-ranging symptoms make symptom-based diagnosis and prediction of progression to severe disease a challenge [2]. Instead, as community spread of COVID-19 has exponentially increased and testing has scaled up, race and ethnicity have consistently been found to predict hospitalization and mortality [1,3,4]. The complexities underlying these health disparities are uncertain, and undoubtedly include a mix of social, economic, access, and behavioral factors [3]. Moreover, certain comorbidities (including diabetes, heart disease, chronic kidney disease, and obesity) are now known to strongly predict COVID-19 hospitalization [5]; most of these comorbidities have disparate prevalence by race/ethnicity, but are not alone sufficient to explain racial/ethnic disparities in COVID-19's impacts.
Amidst this complex interplay of factors, the ambulatory presentation of milder COVID-19 disease, and which characteristics might predict progression to severe disease, remains poorly understood. Widespread testing and clinical documentation of ambulatory cases is necessary to fill in our understanding of the spectrum of COVID-19 disease. In March 2020, Stanford University in Santa Clara County was the first California healthcare system to institute drive-through testing available to anyone in the outpatient setting with COVID-19 symptoms, regardless of their insurance status [6].
Stanford testing guidelines in March 2020 were relatively broad and symptom-based, allowing testing of anyone with new-onset fever, cough, sore throat, shortness of breath, or flu-like symptoms in the preceding 14 days. By comparison, the CDC guidelines at the time limited testing to those with lower respiratory infection, travel to known high-risk regions, and known or suspected close contact with COVID-19+ individuals. As a result of its criteria, in its first month of testing, Stanford included a broader population of less severely ill patients than most previously reported cohorts. Moreover, the tested cohort presents unique socio-demographic diversity, as Silicon Valley has both wide disparities in income and unique racial/ethnic diversity: Santa Clara and San Mateo counties have high population representation of Asian (39% and 30%, respectively) and Hispanic/Latinx (36% and 24%) residents, but relative under-representation of African American/Black residents (< 3%) [7,8].
We conducted a detailed chart review of COVID-19 positive patients presenting in the first month of wide-scale symptomatic testing to explore the patterns of socio-demographics, comorbid conditions, and symptomatology and so further our understanding of the disease.

Setting
Stanford Health Care (SHC) is an academic health system in Silicon Valley. It began testing for COVID-19 on March 4, 2020 using a reverse-transcriptase polymerase chain reaction (RT-PCR) diagnostic test developed by Stanford's clinical virology laboratory. The diagnostic test identifies the presence of viral RNA from nasopharyngeal swabs of potentially infected people, with an analytic sensitivity of 1 × 10⁻² TCID50/mL and an analytic specificity of 100% [9,10]. This retrospective chart review study was approved by Stanford's Institutional Review Board (IRB 55757).

Patient population and data
We first identified all positive COVID-19 RT-PCR tests performed by the Stanford laboratory from March 4 to March 31, 2020. These represented a mix of patients seen at an SHC clinic/facility and those seen at external hospital systems that were using Stanford as a reference laboratory. Date and location of testing, age at specimen collection, gender, race/ethnicity, insurance plan, and county of residence were electronically extracted from Stanford's electronic medical record system. Only age and gender were available for patients tested at an external hospital. These patients were included only to assess how this broader group compared, with respect to age and gender, to the more well-defined COVID-19 patient population who sought testing at Stanford, as a way to inform the regional generalizability of the patients for whom we have more in-depth data.
We conducted a chart review on the subset whose tests were collected at SHC facilities, specifically Stanford Hospital, Stanford ER, or one of ten Stanford primary care outpatient clinics, including Stanford Express Clinic; patients whose test specimens were sent from a non-Stanford facility were excluded from the chart review (Fig. 1). Abstracted data included: potential source of exposure to COVID-19; symptoms; medical history; hospitalization; ICU admission; and death. Hospitalizations, ICU admissions, and deaths were assessed through April 29, 2020. The chart review data were independently abstracted by two medical students using double data entry; a faculty physician then re-reviewed all charts and reconciled any inconsistencies.

Data analysis
We calculated descriptive statistics overall and by location of presentation for all COVID-19 positive patients, including both those captured in external specimens sent from non-Stanford facilities and the subset included in the chart review, i.e., patients seen at an SHC clinic/facility. The χ² test, Fisher's exact test, or a Monte Carlo estimate for Fisher's exact test was used to compare differences between patients who presented at outpatient, ER, and inpatient settings.
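As an illustration of the comparison across the three access points, the following minimal Python sketch computes a Pearson χ² statistic for a 3 × 2 table. It is not the analysis code used in this study (analyses were run in SAS); the cell counts are hypothetical and the function names are illustrative. For a 3 × 2 table there are 2 degrees of freedom, for which the upper-tail p-value has the closed form exp(−x/2).

```python
import math


def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail p-value of a chi-square variable with exactly 2 dof."""
    return math.exp(-statistic / 2.0)


# HYPOTHETICAL counts, not taken from the study's tables.
# Rows: outpatient, ER, inpatient; columns: symptom present, symptom absent.
hypothetical_counts = [[30, 70], [25, 55], [20, 10]]

stat, dof = chi_square_statistic(hypothetical_counts)
p = p_value_two_dof(stat)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p:.4f}")
```

When expected cell counts are small, the χ² approximation becomes unreliable, which is the usual reason for falling back on Fisher's exact test as described above.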
We conducted logistic regression analyses on the chart review subset to assess factors associated with the odds of developing severe disease, defined as hospitalization and/or death. Patients with insufficient data available to determine whether they were hospitalized or died (i.e., no subsequent contact confirming recovery 10 days after the positive test) were excluded from these analyses. In descriptive statistics and logistic regression models, we defined hypertension (HTN) as hypertension in the absence of other cardiovascular disease (CVD) or stage 3+ chronic kidney disease (CKD), the latter defined as pre-existing documentation of GFR < 60, in order to assess whether common HTN, in the absence of more severe and often related complications/comorbidities, was associated with developing severe COVID-19.
Two regression models were fit separately, examining associations of disease severity with: 1) patient socio-demographics; and 2) comorbidities and symptoms. We combined statistically significant variables from (1) and (2) to form a semi-final model. Because there was strong clinical interest in the estimates for the remaining comorbidities and symptoms, we added each into the semi-final model individually to evaluate their associations; our goal with these analyses was to build upon the growing early literature to better describe patients most likely to develop severe disease. The final model included all variables in the semi-final model plus any additional variables that were significant when added individually to it. Variables were not included in any of the models if > 10% of the data were missing (i.e., BMI and smoking) or if they prevented model convergence (i.e., loss of taste and back pain). Confidence intervals excluding 1.0 are considered statistically significant in logistic regression models. Analyses were performed using SAS (version 9.4; SAS Institute, Inc., Cary, NC, USA).
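The variable definitions above can be sketched as simple derivations from abstracted chart data. The Python sketch below is illustrative only: the record layout and field names are hypothetical and do not describe the study's abstraction form, and "pre-existing documentation of GFR < 60" is approximated here by the lowest documented pre-existing GFR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    # Hypothetical field names chosen for illustration.
    hypertension: bool
    other_cvd: bool
    lowest_documented_gfr: Optional[float]  # None if no GFR was documented
    hospitalized: bool
    died: bool


def has_stage3_plus_ckd(record: PatientRecord) -> bool:
    """Stage 3+ CKD, taken here as any pre-existing documented GFR < 60."""
    gfr = record.lowest_documented_gfr
    return gfr is not None and gfr < 60


def htn_alone(record: PatientRecord) -> bool:
    """Hypertension in the absence of other CVD or stage 3+ CKD."""
    return (
        record.hypertension
        and not record.other_cvd
        and not has_stage3_plus_ckd(record)
    )


def severe_disease(record: PatientRecord) -> bool:
    """Severe disease: hospitalization and/or death."""
    return record.hospitalized or record.died


example = PatientRecord(
    hypertension=True,
    other_cvd=False,
    lowest_documented_gfr=82.0,
    hospitalized=False,
    died=False,
)
print(htn_alone(example), severe_disease(example))
```

Under this definition, a hypertensive patient with documented GFR < 60 or with other CVD is counted under CKD or CVD rather than under HTN, so the HTN estimate reflects uncomplicated hypertension only.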
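The significance rule stated above (a confidence interval excluding 1.0) can be applied mechanically to the odds ratios reported in the abstract and results. The short Python sketch below does so; the function name and labels are illustrative. Because the intervals are reported to one decimal place, a bound printed as exactly 1.0 cannot be resolved from the rounded value, so the sketch flags it separately rather than guessing.

```python
def classify_interval(lower: float, upper: float) -> str:
    """Classify a reported 95% CI for an odds ratio against the null value 1.0."""
    if lower > 1.0 or upper < 1.0:
        return "excludes 1.0"
    if lower == 1.0 or upper == 1.0:
        # Rounded bound sits on the null value; cannot be resolved from print.
        return "bound rounds to 1.0"
    return "includes 1.0"


# (lower, upper) bounds as reported in this paper, rounded to one decimal.
reported_intervals = {
    "Asian vs Caucasian": (1.6, 14.2),
    "Hispanic vs Caucasian": (1.1, 11.9),
    "Medicare vs commercial": (0.9, 17.8),
    "Kidney disease": (1.0, 38.1),
    "Cardiovascular disease": (1.0, 22.1),
    "Shortness of breath": (2.3, 12.6),
    "GI symptoms": (1.4, 7.7),
    "Hypertension alone": (0.8, 6.5),
}

for label, (lower, upper) in reported_intervals.items():
    print(f"{label}: [{lower}, {upper}] -> {classify_interval(lower, upper)}")
```

Intervals that include 1.0 but have a lower bound close to it correspond to the associations described in the text as marginal.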

All COVID-19+ patients
Eight hundred forty-two local patients tested positive for COVID-19 in March 2020 through Stanford Health Care's laboratory; over half of these (54%) were specimens sent from non-Stanford facilities (Fig. 1). There were significant differences in the demographics of patients across testing settings (Table 1) for all characteristics examined. The outpatient care setting had the highest proportion of women, Caucasians and commercially insured patients. Outpatients skewed toward younger patients, with only 5% aged 75 years or older. In contrast, the inpatient setting had the highest proportion of men, patients over 65 years of age, and Medicare-insured patients. The ER setting had the highest proportion of younger (43% less than 40 years), Hispanic (32%) and Medicaid-insured patients (23%). The external patients most closely resembled the age and sex distribution of the inpatients, suggesting these were hospitalized patients with more severe disease at time of diagnosis.
Regardless of point of access, the vast majority (72%) of patients lived in one of the two counties closest to the main Stanford campus, Santa Clara or San Mateo; ER patients were most likely to be from these local counties (85% compared to 69 and 62% of outpatients and inpatients, respectively).

Chart review subset
There were 257 patients in the chart review analysis: 127 (49%) outpatient; 34 (13%) inpatient; and 96 (37%) ER (Fig. 1). The demographic characteristics of the chart review subset (Table 2) were similar to those of the larger cohort (Table 1, where percentages for some variables were calculated on n = 389 because the external referral group, for whom the vast majority of these data were missing, was omitted). Approximately 41% of patients had contact with someone they knew to have COVID-19, 25% had traveled recently, and 6% lived in a group living situation (shelter, dormitory, skilled nursing facility, assisted living facility, or homelessness; Table 2). The inpatient setting, however, had the highest proportion living in a communal setting (18%, compared to 3% and 6% of those tested in the outpatient and ER settings, respectively; p < 0.05). The ER testing setting had the highest proportion of patients with known contacts (46%) and the outpatient testing setting had the highest proportion of recent travelers (31%); however, neither of these differed significantly by testing access point.
The most common comorbidities in all three groups were BMI ≥ 30, lung conditions, HTN, and ever smoking. Frequency of comorbidities varied little between those tested in the outpatient and ER settings aside from diabetes (8% outpatient vs 18% ER). In contrast, inpatients had the highest proportion of every comorbidity examined, except for gastrointestinal (GI) conditions (Table 3). The largest difference was in the proportion with stage 3+ CKD, as defined by a history of GFR < 60: 21% of those tested in the inpatient setting but only 2% in the ER and outpatient settings. All differences were statistically significant (p < 0.05) except those for lung conditions, immunosuppressive conditions, neurologic conditions, and GI conditions (Table 3).
Table 4 shows the frequency of symptoms overall and by testing access point. The most common symptoms in all patients were cough (86%), fever/chills (76%), and fatigue (63%). There was more variation in the frequency of other symptoms across patient groups. Specifically, outpatients had the highest proportion of each of the other symptoms except pleuritic chest pain, shortness of breath, GI symptoms (nausea, vomiting, or diarrhea), and back pain. ER patients were most likely to present with pleuritic chest pain (35% vs 24%; not significant), whereas inpatients were most likely to present with shortness of breath (65% vs 30–32%; p < 0.001) or GI symptoms (47% vs 22–24%; p < 0.05). Loss of taste and back pain/ache were the least common presenting symptoms investigated, at 9% and 3%, respectively.

Severe disease
A total of 41 patients experienced severe disease, defined as hospitalization or death. One outpatient died at home and four others were later hospitalized. Eight ER patients were later hospitalized, one of whom died. Two of the patients initially tested as inpatients died, for a total of four deaths in the cohort.
Table 5 shows the results of the logistic regression models. Five patients originally tested in the ER and two from the outpatient setting were lost to follow-up and were excluded from this analysis; the remaining 250 patients were included. The final model output indicates that gender, race/ethnicity, and insurance were associated (or marginally associated) with developing severe disease after adjusting for comorbidities and symptoms. Specifically, male relative to female (OR = 2.2; 95% CI: 1.0, 4.7), Asian or Hispanic relative to Caucasian (OR = 4.8; 95% CI: 1.6, 14.2 and OR = 3.6; 95% CI: 1.1, 11.9, respectively), and Medicare insurance relative to commercial insurance (OR = 4.0; 95% CI: 0.9, 17.8) were the patient groups with statistically significant or marginally significant associations. Notably, age was not significantly associated with severe disease after adjusting for insurance status. Equally important, however, is that 20% of patients aged 65+ had commercial insurance; thus, older patients without commercial insurance had the highest risk of severe disease.
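As a reading aid for the intervals above, the following Python sketch checks the internal consistency of the reported odds ratios under the assumption that the intervals are Wald-type, i.e., symmetric on the log-odds scale; the paper does not state the interval method, so this is an assumption. Under it, the geometric mean of the two bounds should approximately recover the point estimate, and the width of the interval on the log scale gives the implied standard error of the log odds ratio.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval

# (odds ratio, lower bound, upper bound) as reported in the text above.
reported = {
    "Male vs female": (2.2, 1.0, 4.7),
    "Asian vs Caucasian": (4.8, 1.6, 14.2),
    "Hispanic vs Caucasian": (3.6, 1.1, 11.9),
    "Medicare vs commercial": (4.0, 0.9, 17.8),
}


def implied_point_estimate(lower: float, upper: float) -> float:
    """Geometric mean of the bounds: the OR implied by a log-symmetric CI."""
    return math.sqrt(lower * upper)


def implied_log_se(lower: float, upper: float) -> float:
    """Standard error of log(OR) implied by a Wald-type 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


for label, (odds_ratio, lower, upper) in reported.items():
    implied = implied_point_estimate(lower, upper)
    log_se = implied_log_se(lower, upper)
    print(
        f"{label}: reported OR {odds_ratio}, "
        f"implied OR {implied:.2f}, SE(log OR) {log_se:.2f}"
    )
```

The wide intervals correspond to large standard errors on the log scale, which is consistent with the modest number of severe outcomes in this cohort.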

Discussion
This study examined the first cohort of ambulatory COVID-19 positive patients in the Silicon Valley region of California, one of the first communities in the United States to scale up community testing in response to early community spread. In exploring patient demographics, comorbidities and symptoms in relationship to their location of presentation and the development of severe disease, we reiterate the growing evidence of socio-demographic disparities in COVID-19's impacts. Our work was motivated by a desire to identify clinical predictors of progression to more severe disease, but instead we find that race/ethnicity and insurance predict risk of hospitalization at the same or a similar order of magnitude as the most predictive comorbidities and symptoms.
Thus, our data retell a story of racial inequity in health outcomes, but with a specific local flavor. Silicon Valley's diversity includes large representation of Asian and Latinx populations, and while these groups were slightly under-represented compared to the local population, they still each represented a fifth of our study population. Notably, we had too few African American patients in our cohort to analyze meaningfully.
Our findings are both consistent with and distinct from similar work looking at the broader San Francisco Bay area, done by the Sutter Health Care system; both studies demonstrate racial disparity in COVID-19, but highlight different affected minority communities [4]. Whereas theirs and prior work demonstrated a 3-fold risk of hospitalization for African American COVID-19 patients, our cohort demonstrates a significant, approximately 3-fold odds of hospitalization in Asian and Latinx patients. In particular, we believe this markedly increased risk of hospitalization in Asian patients is a novel finding, made possible by the relatively large representation in our local population.
The fact that socio-demographic factors, including race/ethnicity and insurance type/status, were associated with severity likely reflects the confluence of multiple underlying disparities including social, economic, access, and behavioral factors [3,4]. These might manifest as barriers to timely presentation to care and influence where patients eventually access care. This hypothesis is supported by our observation that the location of presentation (outpatient, inpatient, or emergency room) was most strikingly different by insurance status and race/ethnicity.
Latinx patients with COVID-19 were most likely to present to ER or inpatient settings. Patients with commercial insurance were most likely to present at an outpatient location, while patients without insurance or with Medicaid were most likely to have their COVID-19+ status captured in the ER or once inpatient. Many factors might contribute to these differences, including familiarity with ways to rapidly access outpatient appointments, having a primary care physician, language and technical barriers to scheduling, cultural norms, perceptions of insurance requirements in different locations, and gaps in communication and knowledge of the broadened access Stanford offered (i.e., accepting all payers and uninsured patients) for outpatient COVID-19 testing.
There were notably fewer differences by location of presentation in terms of presenting symptoms and comorbidities; our study provides important descriptions of both, but also reiterates the diversity of symptoms that makes clinical prediction a challenge for this disease. The two symptoms identified were shortness of breath (dyspnea) and GI symptoms, which were more common in hospitalized patients (65% and 47%, respectively) than in outpatient (30% and 24%, respectively) or emergency room (32% and 22%, respectively) presentations. This is unsurprising, as respiratory distress is a frequent trigger for hospital admission, as is dehydration from severe vomiting and/or diarrhea. In our cohort, the comorbidities we observed to predict severe outcomes were consistent with the CDC's update of July 13, 2020, and we re-confirm a strong association with underlying CVD (OR 4.7) and chronic renal disease (OR 6.1) [5]. In the same update, the CDC lists hypertension as having "mixed evidence" for severe disease. Our data similarly found marginal evidence for hypertension alone as an independent predictor of worse outcomes, with an odds ratio of 2.3. We purposefully defined our hypertension variable to capture the more common cases of hypertension, uncomplicated by CVD or CKD. This is important, given that hypertension is one of the most prevalent conditions in the US, affecting one-third of adults [11]. The findings from our cohort of predominantly ambulatory COVID-19 patients support parallel findings seen in the predominantly inpatient data from meta-analysis, and together they reinforce that HTN alone may be an independent risk factor for developing severe disease [12]. Our lack of evidence of increased risk with asthma, neurologic conditions, or diabetes more likely reflects our relatively small sample size than evidence against their plausible association with severe COVID-19.

Limitations and strengths
Despite broad testing criteria and early ramp-up of testing in our local system, the cohort included in our one-month chart review was relatively small, which particularly limited our ability to examine comorbidities with somewhat lower prevalence (e.g., neurologic conditions) and data with high missingness (e.g., BMI and tobacco use). Further, the limitations on testing availability in March led to a selection bias. In our clinical experience, when testing was limited, many younger and healthier patients were assumed positive and not tested. Outpatient testing was prioritized for older persons, which may bias some of our estimates.
Strengths of our study include our methodology of rigorous manual chart review, which allowed for comprehensive identification of both comorbidities and symptoms. The accuracy and completeness of these factors go beyond those of prior COVID-19 studies, which have relied on diagnostic codes or used natural language processing to estimate COVID-19 symptomatology [2,4]. Our race and ethnicity data had a relatively low rate of missingness, an issue that has been increasingly identified as a barrier to understanding the true extent of disparities in COVID-19 [13,14]. The study also had high follow-up for outcomes of disease severity, at 97% of all patients.

Conclusions
When and how care is accessed, and the outcomes of severe COVID-19 disease, are affected by ethnicity and insurance type. We reiterate the disproportionate impact of COVID-19 on minority populations and specifically find that, in a largely ambulatory population in a region with large Asian and Latinx representation, both of these race/ethnicity groups were associated with more severe cases of COVID-19. We also find further marginal evidence to support the to-date uncertain association of hypertension (independent of renal or more severe cardiovascular disease) with more severe COVID-19 disease [5].