  • Research article
  • Open access

Relationship of socio-demographics, comorbidities, symptoms and healthcare access with early COVID-19 presentation and disease severity



COVID-19 studies are primarily from the inpatient setting, skewing towards severe disease. Race and comorbidities predict hospitalization; however, the ambulatory presentation of milder COVID-19 disease and the characteristics associated with progression to severe disease are not well understood.


We conducted a retrospective chart review including all COVID-19 positive cases from Stanford Health Care (SHC) in March 2020 to assess demographics, comorbidities and symptoms in relationship to: 1) their access point of testing (outpatient, inpatient, and emergency room (ER)) and 2) development of severe disease.


Two hundred fifty-seven patients tested positive: 127 (49%), 96 (37%), and 34 (13%) at outpatient, ER and inpatient, respectively. Overall, 61% were age < 55; age > 75 was rarer in the outpatient setting (11%) than in the ER (14%) or inpatient (24%) settings. Most patients presented with cough (86%), fever/chills (76%), or fatigue (63%). 65% of inpatients reported shortness of breath compared to 30–32% of outpatients and ER patients. Ethnic/minority patients had a significantly higher risk of developing severe disease (Asian OR = 4.8 [1.6–14.2], Hispanic OR = 3.6 [1.1–11.9]). Medicare-insured patients were marginally more likely to develop severe disease (OR = 4.0 [0.9–17.8]). Other factors associated with developing severe disease included kidney disease (OR = 6.1 [1.0–38.1]), cardiovascular disease (OR = 4.7 [1.0–22.1]), shortness of breath (OR = 5.4 [2.3–12.6]) and GI symptoms (OR = 3.3 [1.4–7.7]); hypertension without concomitant CVD or kidney disease was marginally significant (OR = 2.3 [0.8–6.5]).


Early widespread symptomatic testing for COVID-19 in Silicon Valley included many less severely ill patients. Thorough manual review of symptomatology reconfirms the heterogeneity of COVID-19 symptoms, and challenges in using clinical characteristics to predict decline. We re-demonstrate that socio-demographics are consistently associated with severity.


During the week of February 23, 2020, community spread of the virus that causes COVID-19 was reported in California. On March 7 the Centers for Disease Control (CDC) reported 213 confirmed cases in the US [1]. Early data on COVID-19 symptoms is primarily from the inpatient setting and therefore skews towards the more severely ill, leaving many knowledge gaps around characteristics of the disease in the ambulatory population.

Health professionals have struggled to identify who is most at risk of severe disease, as COVID-19’s variable and wide-ranging symptoms make symptom-based diagnosis and prediction of progression to severe disease a challenge [2]. Instead, as community spread of COVID-19 has exponentially increased and testing has scaled up, race and ethnicity have consistently been found to predict hospitalization and mortality [1, 3, 4]. The complexities underlying these health disparities are uncertain, and undoubtedly include a mix of social, economic, access, and behavioral factors [3]. Moreover, certain comorbidities (including diabetes, heart disease, chronic kidney disease, and obesity) are now known to strongly predict COVID-19 hospitalization [5]; most of these comorbidities have disparate prevalence by race/ethnicity, but are not alone sufficient to explain racial/ethnic disparities in COVID-19’s impacts.

Amidst this complex interplay of factors, the ambulatory presentation of milder COVID-19 disease, and which characteristics might predict progression to severe disease, remains poorly understood. Widespread testing and clinical documentation of ambulatory cases is necessary to fill in our understanding of the spectrum of COVID-19 disease. In March 2020, Stanford University in Santa Clara County was the first California healthcare system to institute drive-through testing available to anyone in the outpatient setting with COVID-19 symptoms, regardless of their insurance status [6].

Stanford testing guidelines in March 2020 were relatively broad and symptom-based, allowing testing of anyone with new-onset fever, cough, sore throat, shortness of breath, or flu-like symptoms in the preceding 14 days. By comparison, the CDC guidelines at the time limited testing to those with lower respiratory infection, travel to known high-risk regions, and known or suspected close contact with COVID-19+ individuals. As a result of its criteria, in its first month of testing Stanford captured a broader population of less severely ill patients than most previously reported cohorts. Moreover, the tested cohort presents unique socio-demographic diversity, as Silicon Valley has both wide disparities in income and unique racial/ethnic diversity: Santa Clara and San Mateo counties have high population representation of Asian (39 and 30%, respectively) and Hispanic/Latinx (36 and 24%) residents, but relative under-representation of African American/Black residents (< 3%) [7, 8].

We conducted a detailed chart review of COVID-19 positive patients presenting in the first month of wide-scale symptomatic testing to explore the patterns of sociodemographics, comorbid conditions, and symptomatology, and so further our understanding of the disease.



Stanford Health Care (SHC) is an academic health system in Silicon Valley. It began testing for COVID-19 on March 4, 2020 using a reverse-transcriptase polymerase chain reaction (RT-PCR) diagnostic test developed by Stanford’s clinical virology laboratory. The diagnostic test identifies the presence of viral RNA from nasopharyngeal swabs of potentially infected people, with an analytic sensitivity of 1 × 10^−2 TCID50/mL and an analytic specificity of 100% [9, 10]. This retrospective chart review study was approved by Stanford’s Institutional Review Board (IRB 55757).

Patient population and data

We first identified all positive COVID-19 RT-PCR tests performed by the Stanford laboratory from March 4 to March 31, 2020. These represented a mix of patients seen at an SHC clinic/facility and those seen at external hospital systems that were using Stanford as a reference laboratory. Date and location of testing, age at specimen collection, gender, race/ethnicity, insurance plan, and county of residence were electronically extracted from Stanford’s electronic medical record system. Only age and gender were available for patients tested at an external hospital. These patients were included only to assess how this broader group compared, with respect to age and gender, to the better-defined COVID-19 patient population who sought testing at Stanford, as a way to inform the regional generalizability of the patients for whom we have more in-depth data.

We conducted a chart review on the subset whose tests were collected at SHC facilities, specifically Stanford Hospital, Stanford ER, or one of ten Stanford primary care outpatient clinics including Stanford Express Clinic; patients whose test specimens were sent from a non-Stanford facility were excluded from the chart review (Fig. 1). Abstracted data included: potential source of exposure to COVID-19; symptoms; medical history; hospitalization; ICU admission; and death. Hospitalizations, ICU admissions, and deaths were assessed through April 29, 2020. The chart review data were independently abstracted by two medical students using double data entry; a faculty physician then re-reviewed all charts and reconciled any inconsistencies.

Fig. 1

Flow chart depicting total study cohort of COVID-19 positive patients and how the total chart review cohort was achieved. ER, Emergency Room

Data analysis

We calculated descriptive statistics overall and by location of presentation for all COVID-19 positive patients, including both those captured in external specimens sent from non-Stanford facilities and the subset included in chart review, i.e., patients seen at an SHC clinic/facility. The χ2 test, Fisher’s exact test, or a Monte Carlo estimate for the Fisher’s exact test was used to compare differences between patients who presented at outpatient, ER, and inpatient settings.
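As a rough illustration of this kind of comparison, the sketch below computes a χ2 statistic for a 3 × 2 table (testing setting by presence of shortness of breath) and estimates a p-value by Monte Carlo permutation. It is not the authors' SAS procedure: the counts are hypothetical values chosen only to be roughly in line with the reported percentages, the function names are illustrative, and the permutation test orders tables by the χ2 statistic rather than by the table probability used in a Monte Carlo Fisher's exact test.

```python
import random


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_p_value(table, n_sim=2000, seed=0):
    """Permutation p-value: shuffle column labels while keeping both margins fixed."""
    rng = random.Random(seed)
    row_labels, col_labels = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_labels.extend([i] * count)
            col_labels.extend([j] * count)
    observed = chi_square_statistic(table)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(col_labels)
        simulated = [[0] * len(table[0]) for _ in table]
        for r, c in zip(row_labels, col_labels):
            simulated[r][c] += 1
        if chi_square_statistic(simulated) >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_sim + 1)


# Hypothetical counts (assumption): rows are outpatient, ER, inpatient;
# columns are shortness of breath present / absent.
counts = [[38, 89], [31, 65], [22, 12]]
statistic = chi_square_statistic(counts)
p_value = monte_carlo_p_value(counts)
print(f"chi-square = {statistic:.2f}, Monte Carlo p = {p_value:.4f}")
```

The permutation approach is mainly useful when some expected cell counts are too small for the χ2 approximation, which is the situation in which an exact or Monte Carlo test would typically be preferred.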

We conducted logistic regression analyses on the chart review subset to assess factors associated with the odds of developing severe disease, defined as hospitalization and/or death. Patients with insufficient data available to determine whether they were hospitalized or died (i.e., no subsequent contact confirming recovery 10 days after positive test) were excluded from these analyses. In descriptive statistics and logistic regression models, we defined hypertension (HTN) as HTN in the absence of other cardiovascular disease (CVD) or stage 3+ chronic kidney disease (CKD), the latter defined as pre-existing documentation of GFR < 60. This allowed us to assess whether common HTN, in the absence of more severe and often related complications/comorbidities, was associated with developing severe COVID-19.

Two regression models were fit separately, examining associations of: 1) patient socio-demographics with disease severity; and 2) comorbidities and symptoms with disease severity. We combined statistically significant variables from (1) and (2) to form a semi-final model. Because there was strong clinical interest regarding the estimates for the remaining comorbidities and symptoms, we added each into the semi-final model individually to evaluate their associations; our goal with these analyses was to build upon the growing early literature to better describe patients most likely to develop severe disease. The final model included all variables in the semi-final model plus any additional variables that were significant when added individually to it. Variables were not included in any of the models if > 10% of the data were missing (i.e., BMI and smoking) or they prevented model convergence (i.e., loss of taste and back pain). Confidence intervals excluding 1.0 are considered statistically significant in logistic regression models. Analyses were performed using SAS (version 9.4; SAS Institute, Inc., Cary, NC, USA).
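To make the significance rule concrete, the following minimal sketch computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2 × 2 table and checks whether the interval excludes 1.0. The counts and the function name are hypothetical, and the result is a crude (unadjusted) estimate; the odds ratios reported in this study come from multivariable models fit in SAS and would not be reproduced by this calculation.

```python
import math


def odds_ratio_with_wald_ci(a, b, c, d, z_crit=1.96):
    """Unadjusted odds ratio and Wald CI for a 2 x 2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * se_log_or)
    upper = math.exp(log_or + z_crit * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (assumption): 20 of 60 patients with a symptom developed
# severe disease, versus 20 of 190 patients without the symptom.
or_estimate, ci_lower, ci_upper = odds_ratio_with_wald_ci(20, 40, 20, 170)

# The interval is treated as statistically significant if it excludes 1.0.
is_significant = not (ci_lower <= 1.0 <= ci_upper)
print(f"OR = {or_estimate:.2f} (95% CI: {ci_lower:.2f}, {ci_upper:.2f}); "
      f"significant: {is_significant}")
```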


All COVID-19+ patients

Eight hundred forty-two local patients tested positive for COVID-19 in March 2020 through Stanford Health Care’s laboratory; over half of these (54%) were specimens sent from non-Stanford facilities (Fig. 1). There were significant differences in the demographics of positive COVID-19 patients by point of access to testing (Table 1) for all characteristics examined. The outpatient care setting had the highest proportion of women, Caucasians and commercially insured patients. Outpatients skewed toward younger patients, with only 5% aged 75 years or older. In contrast, the inpatient setting had the highest proportion of men, patients aged over 65 years, and Medicare-insured patients. The ER setting had the highest proportion of younger (43% less than 40 years), Hispanic (32%) and Medicaid-insured patients (23%). The external patients most closely resembled the age and sex distribution of the inpatients, suggesting these were hospitalized patients with more severe disease at time of diagnosis. Regardless of point of access, the vast majority (72%) of patients lived in one of the two counties closest to the main Stanford campus, Santa Clara or San Mateo; ER patients were most likely to be from these local counties (85% compared to 69 and 62% of outpatients and inpatients, respectively).

Table 1 Demographic Characteristics of All COVID-19 Patients Diagnosed in March 2020 at Stanford

Chart review subset

There were 257 patients in the chart review analysis: 127 (49%) outpatient; 34 (13%) inpatient; and 96 (37%) ER (Fig. 1). The demographic characteristics of the chart review subset (Table 2) were similar to those of the larger cohort (Table 1). Approximately 41% of patients had contact with someone they knew to have COVID-19, 25% had traveled recently and 6% lived in a group living situation (Table 2). The inpatient setting, however, had the highest proportion living in a communal setting (18%, compared to 3 and 6% of those tested in the outpatient and ER settings, respectively (p < 0.05)). The ER testing setting had the highest proportion of patients with known contacts (46%) and the outpatient testing setting had the highest proportion of recent travelers (31%); however, neither of these differed significantly by testing access point.

Table 2 Demographic Characteristics of COVID-19 Patients Diagnosed in March 2020 at Stanford: Chart review subset

The most common comorbidities in all three groups were BMI ≥ 30, lung conditions, HTN, and ever smoking. Frequency of comorbidities varied little between those tested in the outpatient and ER settings aside from diabetes (8% outpatient vs 18% ER). In contrast, inpatients had the highest proportion of every comorbidity examined, except for gastrointestinal (GI) conditions (Table 3). The largest difference was in the proportion with stage 3+ CKD (defined by a history of GFR < 60): 21% of those tested in the inpatient setting but only 2% in the ER and outpatient settings. All differences were statistically significant (p < 0.05) except lung conditions, immunosuppressive conditions, neurologic conditions, and GI conditions (Table 3).

Table 3 Medical History of COVID-19 Patients Diagnosed in March 2020 at Stanford: Chart review subset

Table 4 shows the frequency of symptoms overall and by testing access point. The most common symptoms in all patients were cough (86%), fever/chills (76%), and fatigue (63%). There was more variation in frequency of other symptoms across patient groups. Specifically, outpatients had the highest proportion of each of the other symptoms except pleuritic chest pain, shortness of breath, GI symptoms, and back pain. ER patients were most likely to present with pleuritic chest pain (35% vs 24%; not significant), whereas inpatients were most likely to present with shortness of breath (65% vs 30–32%; p < 0.001) or GI symptoms (47% vs 22–24%; p < 0.05). Loss of taste and back pain/ache were the least common presenting symptoms investigated, at 9 and 3%, respectively.

Table 4 Symptoms at Presentation for Testing of COVID-19 Patients Diagnosed in March 2020 at Stanford: Chart review subset

Severe disease

A total of 41 patients experienced severe disease, defined as hospitalization or death. One outpatient died at home and four others were later hospitalized. Eight ER patients were later hospitalized, one of whom died. Two of the patients initially tested as inpatients died, for a total of four deaths in the cohort. Table 5 shows the results of the logistic regression models. Five patients originally tested in the ER and two from the outpatient setting were lost to follow-up and were excluded from this analysis; the remaining 250 patients were included. The final model output indicates that gender, race/ethnicity, and insurance were associated (or marginally associated) with developing severe disease after adjusting for comorbidities and symptoms. Specifically, male relative to female (OR = 2.2; 95% CI: 1.0, 4.7), Asian or Hispanic relative to Caucasian (OR = 4.8; 95% CI: 1.6, 14.2 and OR = 3.6; 95% CI: 1.1, 11.9, respectively), and Medicare insurance relative to commercial insurance (OR = 4.0; 95% CI: 0.9, 17.8) were the patient groups with statistically significant or marginally significant associations. Notably, age was not significantly associated after adjusting for insurance status. Equally important, however, is that 20% of patients aged 65+ had commercial insurance; thus, older patients without commercial insurance had the highest risk of severe disease.

Table 5 Logistic Regression Models for Severe COVID-19 Disease (inpatient admission or death)

CVD (OR = 4.7; 95% CI: 1.0, 22.1), HTN (OR = 2.3; 95% CI: 0.8, 6.5), and stage 3+ kidney disease (OR = 6.1; 95% CI: 1.0, 38.1) were all marginally significantly related to development of severe disease in the final model (Table 5). Shortness of breath (OR = 5.5; 95% CI: 2.3, 13.0) and GI symptoms (OR = 3.3; 95% CI: 1.4, 7.7) were the only presenting symptoms associated with developing severe disease. No other comorbidities or symptoms approached statistical significance.
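One way to read "significant" versus "marginally significant" from the reported intervals is to back-calculate an approximate standard error and p-value from each odds ratio and its 95% CI. The sketch below does this under the assumption that the intervals are Wald intervals, symmetric on the log scale; the function name is illustrative and, because the inputs are rounded published values, the outputs are approximations rather than the model's actual p-values.

```python
import math


def wald_summary_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate SE of log(OR), z statistic and two-sided p-value from a 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_log_or, z, p


# Odds ratios and 95% CIs as reported in the text (rounded values).
reported = {
    "Asian vs Caucasian": (4.8, 1.6, 14.2),
    "Hispanic vs Caucasian": (3.6, 1.1, 11.9),
    "Medicare vs commercial": (4.0, 0.9, 17.8),
    "Shortness of breath": (5.5, 2.3, 13.0),
}

summaries = {name: wald_summary_from_ci(*values) for name, values in reported.items()}
for name, (se_log_or, z, p) in summaries.items():
    print(f"{name}: SE(log OR) = {se_log_or:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions, an interval that just includes 1.0, such as the Medicare estimate, corresponds to a p-value somewhat above 0.05, which matches its description as marginally significant.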


This study examined the first cohort of ambulatory COVID-19 positive patients in the Silicon Valley region of California, one of the first communities in the United States to scale up community testing in response to early community spread. In exploring patient demographics, comorbidities and symptoms in relationship to their location of presentation and the development of severe disease, we reiterate the growing evidence of socio-demographic disparities in COVID-19’s impacts. Our work was motivated by a desire to identify clinical predictors of progression to more severe disease, but instead we find that race/ethnicity and insurance predict risk of hospitalization at the same or similar order of magnitude as the most predictive comorbidities and symptoms. Thus, our data once again tell a story of racial inequity in health outcomes, but with specific local flavor. Silicon Valley’s diversity includes large representation of Asian and Latinx populations, and while these groups were slightly under-represented compared to the local population, they still each represented a fifth of our study population. Notably, we had too few African American patients in our cohort to meaningfully analyze.

Our findings are both consistent with and distinct from similar work looking at the broader San Francisco Bay area, done by the Sutter Health Care system: both studies demonstrate racial disparity in COVID-19, but highlight different affected minority communities [4]. Whereas theirs and prior work demonstrated a 3-fold risk of hospitalization for African American COVID-19 patients, our cohort demonstrates significant, approximately 3-fold odds of hospitalization in Asian and Latinx patients. In particular, we believe this markedly increased risk of hospitalization in Asian patients is a novel finding, made possible by the relatively large representation in our local population.

The fact that socio-demographic factors – including race/ethnicity and insurance type/status – were associated with severity likely reflects the confluence of multiple underlying disparities including social, economic, access, and behavioral factors [3, 4]. These might manifest as barriers to timely presentation to care and influence where patients eventually access care. This hypothesis is supported by our observation that the location of presentation (outpatient, inpatient, or emergency room) differed most strikingly by insurance status and race/ethnicity.

Latinx patients with COVID-19 were most likely to present to ER or inpatient settings. Patients with commercial insurance were most likely to present at an outpatient location while patients without insurance or with Medicaid were most likely to have their COVID-19+ status captured in the ER or once inpatient. Many factors might contribute to these differences, including familiarity with ways to rapidly access outpatient appointments, having a primary care physician, language and technical barriers to scheduling, cultural norms, perceptions of insurance requirements in different locations, and gaps in communication and knowledge of Stanford’s broadened access offered (i.e., accepting all-payers and uninsured) for outpatient COVID-19 testing.

There were notably fewer differences by location of presentation in terms of presenting symptoms and comorbidities; our study provides important descriptions of both, but also reiterates the diversity of symptoms that makes clinical prediction a challenge for this disease. The two symptoms identified were shortness of breath/dyspnea and GI symptoms, which unsurprisingly were more common in hospitalized patients (65 and 47%, respectively) than in outpatient (30 and 24%, respectively) or emergency room (32 and 22%, respectively) presentations, as respiratory distress is a frequent trigger for hospital admission, as is dehydration from severe vomiting and/or diarrhea.

In our cohort, the comorbidities we observed to predict severe outcomes were consistent with the CDC’s recent update from July 13, 2020, and we re-confirm a strong association with underlying CVD (OR 4.7) and chronic renal disease (OR 6.1) [5]. In the same update, the CDC lists hypertension as having “mixed evidence” for severe disease. Our data similarly found marginal evidence for hypertension alone as an independent predictor of worse outcomes, with an odds ratio of 2.3. We purposefully defined our hypertension variable to capture the more common cases of hypertension, uncomplicated by CVD or CKD. This is important, given that hypertension is one of the most prevalent conditions in the US, affecting one-third of adults [11]. The findings from our cohort of predominantly ambulatory COVID-19 patients support parallel findings seen in the predominantly inpatient data from a meta-analysis, and together reinforce that HTN alone may be an independent risk factor for developing severe disease [12]. Our lack of evidence of increased risk with asthma, neurologic conditions, or diabetes more likely reflects our relatively small sample size than evidence against their plausible association with severe COVID-19.

Limitations and strengths

Despite broad testing criteria and early ramp-up of testing in our local system, the cohort included in our one-month chart-review was of relatively small sample size, which was particularly limiting in our ability to examine comorbidities with somewhat lower prevalence (e.g., neurologic conditions) and data with high missingness (e.g. BMI and tobacco use). Further, the limitations on testing availability in March led to a selection bias. In our clinical experience, when testing was limited, many younger and healthier patients were assumed positive and not tested. Outpatient testing was prioritized for older persons, which may bias some of our estimates.

Strengths of our study include our methodology of rigorous manual chart review, which allowed for comprehensive identification of both comorbidities and symptoms. The accuracy and completeness of these factors are beyond those of prior COVID-19 studies, which have relied on diagnostic codes or used natural language processing to estimate COVID-19 symptomatology [2, 4]. Our race and ethnicity data had a relatively low rate of missingness, an issue that has been increasingly identified as a barrier to understanding the true extent of disparities in COVID-19 [13, 14]. The study also had high follow-up for outcomes of disease severity, at 97% of all patients.


When and how care is accessed, and the outcomes of severe COVID-19 disease, are affected by ethnicity and insurance type. We reiterate the disproportionate impact of COVID-19 on minority populations and specifically find that in a largely ambulatory population, in a region with large Asian and Latinx representation, both of these race/ethnicity groups were associated with more severe cases of COVID-19. We also find further marginal evidence to support the to-date uncertain association of hypertension (independent of renal or more severe cardiovascular disease) with more severe COVID-19 disease [5].

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available because they are governed by the data stewards of Stanford Health Care. They may be available upon reasonable request pending review by the Privacy Office.



Abbreviations

BMI: Body mass index

CDC: Centers for Disease Control

CKD: Chronic kidney disease

CVD: Cardiovascular disease

ER: Emergency room

GFR: Glomerular filtration rate

OR: Odds ratio

RT-PCR: Real-time reverse transcription polymerase chain reaction

SHC: Stanford Health Care


  1. Centers for Disease Control and Prevention. Updated Guidance on Evaluating and Testing Persons for Coronavirus Disease 2019 (COVID-19). Available at: Accessed 13 July 2020.

  2. Medium. Estimating the feasibility of symptom based classification of COVID-19. Available at: Accessed 20 Mar 2020.

  3. Hooper MW, Nápoles AM, Pérez-Stable EJ. COVID-19 and racial/ethnic disparities. J Am Med Assoc. 2020;323(24):2466–7.

  4. Azar KMJ, Shen Z, Romanelli RJ, et al. Disparities in outcomes among COVID-19 patients in a large health care system in California. Health Aff. 2020;39(7):1253–62.

  5. Centers for Disease Control and Prevention. Evidence used to update the list of underlying medical conditions that increase a person’s risk of severe illness from COVID-19. Available at: Accessed 13 July 2020.

  6. Artandi M, Thomas S, Shah NR, Srinivasan M. Rapid system transformation to more than 75% primary care video visits within three weeks at Stanford: response to public safety crisis during a pandemic. NEJM Catal Innov Care Deliv. 2020.

  7. United States Census Bureau. U.S. Census Bureau QuickFacts: Santa Clara County, California. Available at: Accessed 24 July 2020.

  8. United States Census Bureau. U.S. Census Bureau QuickFacts: San Mateo County, California. Available at: Accessed 24 July 2020.

  9. Centers for Disease Control and Prevention. Health, United States, 2018 – Data Finder. Available at: Accessed 30 Mar 2020.

  10. Hologic. SARS-CoV-2 Assay (Panther Fusion® System). Available at: Accessed 25 July 2020.

  11. Centers for Disease Control and Prevention. Selected health conditions and risk factors, by age: United States, selected years 1988–1994 through 2015–2016. Available at: Accessed 29 Oct 2019.

  12. Pranata R, Lim MA, Huang I, Raharjo SB, Lukito AA. Hypertension is associated with increased mortality and severity of disease in COVID-19 pneumonia: a systematic review, meta-analysis and meta-regression. J Renin-Angiotensin-Aldosterone Syst. 2020;21(2):1–11.

  13. Oppel RA Jr, Gebeloff R, Lai KKR, Wright W, Smith M. The Fullest Look Yet at the Racial Inequity of Coronavirus. The New York Times; 2020.

  14. Hellmann J. Frustrations grow over incomplete racial data on COVID-19 cases, deaths. The Hill. 2020.


We would like to thank Oliva Jee, Aviva Mattingly, Sophia Bechek, Clare Wise, Sydney Hemphill, Yousi Oquendo, Maite Van Hentenryck, Jeffrey Edwards, Tanmaya Sambare, and Jacqueline Tao for their contributions in reviewing patient charts. We also thank Jenna Lange for her help coordinating efforts early on.


This study was funded by the Stanford School of Medicine Division of Primary Care and Population Health. No public funding was utilized to conduct the study.

Author information



LV conceptualized and designed the study, oversaw and helped design and conduct the chart review, interpreted data analysis, and drafted the manuscript; DV developed the Redcap database to capture chart review data, analyzed the data, interpreted data analysis and contributed to manuscript drafts; JGS interpreted data analysis and contributed to manuscript drafts; NL helped design and conduct the chart review, interpreted data analysis, and contributed to manuscript drafts; LE helped conduct the chart review, interpreted data analysis, and contributed to manuscript drafts; MW conceptualized and designed the study, oversaw design of the chart review and data analysis, interpreted data analysis, and drafted the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Marcy Winget.

Ethics declarations

Ethics approval and consent to participate

This study was approved by Stanford’s Institutional Review Board (IRB 55757). Requirement for participant consent was waived.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Vaughan, L., Veruttipong, D., Shaw, J.G. et al. Relationship of socio-demographics, comorbidities, symptoms and healthcare access with early COVID-19 presentation and disease severity. BMC Infect Dis 21, 40 (2021).
