Novel risk scoring system for predicting acute respiratory distress syndrome among hospitalized patients with coronavirus disease 2019 in Wuhan, China

Background The mortality rate from acute respiratory distress syndrome (ARDS) is high among hospitalized patients with coronavirus disease 2019 (COVID-19). Hence, risk evaluation tools are required to immediately identify high-risk patients upon admission for early intervention. Methods A cohort of 220 consecutive patients with COVID-19 was included in this study. To analyze the risk factors of ARDS, data from approximately 70% of the participants were randomly selected and used as the training dataset to establish a logistic regression model, and data from the remaining 30% were used as the test dataset to validate the model. Results Lactate dehydrogenase, blood urea nitrogen, D-dimer, procalcitonin, and ferritin levels were included in the risk score system and were assigned scores of 25, 15, 34, 20, and 24, respectively. The cutoff value for the total score was > 35, with a sensitivity of 100.00% and specificity of 81.20%. The area under the receiver operating characteristic curve was 0.967 (95% confidence interval [CI]: 0.925–0.989), and the Hosmer–Lemeshow test result was 0.437 (P value = 0.437). The model had excellent discrimination and calibration during internal validation. Conclusions The novel risk score may be a valuable risk evaluation tool for screening patients with COVID-19 who are at high risk of ARDS. Supplementary Information The online version contains supplementary material available at 10.1186/s12879-020-05561-y.


Background
In late December 2019, an outbreak of pneumonia of unknown etiology occurred in Wuhan, China, and rapidly spread worldwide [1,2]. Later studies confirmed that the disease was caused by a novel coronavirus referred to as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [3,4]. In February 2020, this emerging infectious disease was officially named coronavirus disease 2019 (COVID-19) by the World Health Organization (WHO) [5,6]. Although most patients diagnosed with COVID-19 experience mild symptoms, 19% could develop severe or fatal symptoms and present with intractable conditions, particularly ARDS [7]. ARDS involves rapidly progressing respiratory failure from non-cardiogenic pulmonary edema, which may require mechanical ventilation due to severe hypoxia and difficulty in breathing [8]. Currently, the Berlin definition for ARDS is utilized, and it recommends the use of three categories to differentiate severity based on the ratio of the partial pressure of oxygen (PaO2) to the fraction of inspired oxygen (FiO2) [9]. According to a recently published study, the mortality rate of patients with COVID-19 who present with ARDS is > 70% [10]. Moreover, the current guidelines for the treatment of ARDS focus on lung-protective ventilation and conservative fluid management, and early interventions were found to have a better therapeutic effect [11]. Thus, to facilitate the early identification of high-risk patients and prevent the development of or reduce the severity of ARDS, the predictors of this condition must be determined. Recent studies have identified several predictors for the unfavorable outcomes of COVID-19 [12][13][14]. However, only a few predictors of the onset of ARDS have been determined. Thus, the key indicators for the risk of ARDS should be identified immediately upon admission. Moreover, risk evaluation models that combine several risk factors are likely to increase the power of prediction.
Some risk scoring systems have been developed for several clinical conditions, including coronary heart disease [15], heart failure [16], and stroke [17], and these systems were found to have great practical value. Thus, a systemic evaluation tool involving risk scores, which can be practical for clinicians, is urgently needed. In this study, we obtained data from 220 patients with confirmed COVID-19 who died in or were discharged from the isolation ward of the Department of Respiratory and Critical Care Medicine at Wuhan Union Hospital between January 10, 2020, and March 5, 2020. The current study aimed to explore the risk factors associated with ARDS in patients with COVID-19 and develop a risk evaluation system for predicting ARDS.

Study design and data collection
This was a retrospective, observational cohort study performed at the isolation ward of the Department of Respiratory and Critical Care Medicine, Wuhan Union Hospital (Huazhong University of Science and Technology, Wuhan, China). We included all adult patients who were diagnosed with COVID-19, according to the WHO interim guidance, and who were discharged or died between January 10, 2020, and March 5, 2020. Because Wuhan Union Hospital has been a designated hospital for treating patients with COVID-19 since January 10, 2020, the population constituted a representative sample of all patients with COVID-19 seeking treatment. This study was approved by the research ethics commission of Wuhan Union Hospital (KY-2020-0040), and the need for informed consent was waived. Demographic, clinical, laboratory, and imaging data were extracted from the electronic medical record system of Wuhan Union Hospital using a standardized data collection form. All data were checked by two physicians (YM and JT), and a third researcher (MH) reviewed and resolved any discrepancies.

Case definition
COVID-19 was confirmed based on the examination of respiratory specimens using real-time reverse transcription polymerase chain reaction (RT-PCR). The examination was performed at the clinical laboratory of Wuhan Union Hospital, which is a qualified institution for nucleic acid testing. Patients who were discharged met all of the following criteria: 1) absence of fever for at least 3 days, 2) notable improvement of findings on chest computed tomography (CT) scan, 3) remission of respiratory symptoms, and 4) two consecutive negative results for SARS-CoV-2 RNA using throat swab samples collected at least 24 h apart.
Fever was defined as an axillary temperature of at least 37.3°C. ARDS was diagnosed according to the Berlin Definition [9]. In brief, patients who experienced acute respiratory failure not fully explained by cardiac failure or fluid overload, with PaO2/FiO2 ≤ 300 mmHg and positive end expiratory pressure or continuous positive airway pressure ≥ 5 cm H2O, and who presented with bilateral opacities on chest radiography not fully explained by effusions, lobar or lung collapse, or nodules were diagnosed with ARDS [8,9].
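As an illustration of the oxygenation criterion summarized above, the following minimal Python sketch assigns the three Berlin severity categories from PaO2/FiO2. The function name `berlin_severity` and its parameters are hypothetical and not part of the study. The sketch treats positive end expiratory pressure and continuous positive airway pressure as a single pressure input and checks only oxygenation; timing, imaging, and the origin of edema still have to be assessed clinically.

```python
from typing import Optional


def berlin_severity(pf_ratio_mmhg: float, pressure_cm_h2o: float) -> Optional[str]:
    """Return the Berlin oxygenation category, or None if the criterion is not met.

    pf_ratio_mmhg: PaO2/FiO2 in mmHg.
    pressure_cm_h2o: PEEP or CPAP in cm H2O (simplification: treated alike).
    Only the oxygenation criterion is evaluated here.
    """
    if pressure_cm_h2o < 5 or pf_ratio_mmhg > 300:
        return None
    if pf_ratio_mmhg > 200:
        return "mild"      # 200 < PaO2/FiO2 <= 300
    if pf_ratio_mmhg > 100:
        return "moderate"  # 100 < PaO2/FiO2 <= 200
    return "severe"        # PaO2/FiO2 <= 100


print(berlin_severity(250, 5))
```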

Clinical examinations and treatments
Routine blood examinations, including complete blood count, liver function, blood lipids, fasting blood glucose, kidney function, uric acid, lactate dehydrogenase, creatine kinase, and assessment of myocardial enzymes, coagulation profile, serum C-reactive protein (CRP) level, erythrocyte sedimentation rate (ESR), serum procalcitonin (PCT) level, and ferritin level, were performed upon patient admission, and re-examination was conducted at least once every 3 days during hospitalization. All patients who were admitted underwent chest CT scans.
Patients admitted to the isolation ward received standard treatment according to the Chinese management guidelines for COVID-19 (version 6.0) [18]. In brief, the antiviral treatment included interferon alpha inhalation (50 μg twice daily), lopinavir and ritonavir (400 and 100 mg twice daily, respectively), and arbidol (200 mg twice daily). Treatment with corticosteroid (40-80 mg/day) and gamma globulin (15-20 g/day) was initiated if patients presented with a resting respiratory rate > 30 per min, oxygen saturation < 93% but without the need for supplemental oxygen, or chest radiography showing > 50% progression within 48 h. Oral and intravenous antibiotics were administered if there was a high risk of concomitant bacterial infection.

Statistical methods
Data obtained from approximately 70% of the participants (n = 154) were randomly selected and used as the training dataset to establish a logistic regression model, and data from the remaining 30% of the patients (n = 66) were used as the test dataset to validate the model. In the training dataset, continuous and categorical variables were presented as median with interquartile range (IQR) and number (n) with percentage (%), respectively. The Mann-Whitney U test, χ2 test, and Fisher's exact test were used as appropriate to compare patients with and without ARDS. Univariate and multivariate logistic regression models were used to test the association between the risk factors and onset of ARDS. A receiver operating characteristic (ROC) curve was established to depict the predictive ability of the variables. The Youden index, calculated as the sum of the sensitivity and specificity minus 1, was used to determine the optimal cutoff value. The area under the curve (AUC) was calculated from the ROC curves to determine the differentiating ability of the corresponding risk factors. Variables with significant differences between patients with and without ARDS were included in the multivariate logistic regression analysis, and a stepwise selection method was used to identify the variables included in the predictive model; P values < 0.1 were used as the selection criterion for variables in the model. The risk score assigned to each variable was determined by multiplying the β coefficient of the significant variable by 10 and rounding to the nearest integer, and the total risk score was calculated as the sum of the scores of the individual risk factors. In the test dataset, the model was evaluated using the Hosmer-Lemeshow test and ROC curve. All data analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA). Two-tailed P values < 0.05 were considered statistically significant for all tests.
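Two of the steps described above, the random 70/30 split and the conversion of β coefficients into integer points, can be sketched in Python as follows. This is an illustration only: the analysis itself was performed in SAS, the function names and the seed are hypothetical, and the β values below are invented so that they round to the published scores; they are not the coefficients estimated in this study.

```python
import random


def split_train_test(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient identifiers into training and test lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducibility
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def points_from_betas(betas):
    """Multiply each beta coefficient by 10 and round to the nearest integer.

    Note: Python's round() resolves exact .5 ties to the even integer,
    which may differ from other software; the values below avoid ties.
    """
    return {name: round(beta * 10) for name, beta in betas.items()}


# HYPOTHETICAL coefficients, chosen only to reproduce the published scores.
example_betas = {"LDH": 2.52, "BUN": 1.48, "D-dimer": 3.41,
                 "PCT": 2.03, "ferritin": 2.38}

train_ids, test_ids = split_train_test(range(220))
print(len(train_ids), len(test_ids))
print(points_from_betas(example_betas))
```

With 220 patients and a training fraction of 0.7, the split yields the 154 and 66 patients reported for the training and test datasets.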

Results
Of the 154 COVID-19 patients whose data were used to establish the model, 37 (24.03%) developed ARDS during hospitalization. Data about the characteristics of the study population collected upon admission and grouped according to the diagnosis of ARDS are presented in Table 1.
The median age of the patients was 57.5 (range: 21-96, IQR: 45-75) years. The median age of the ARDS group was significantly higher than that of the non-ARDS group (69 years [IQR: 62-74] vs. 54 years [IQR: 41-65]). Approximately 53.90% of the patients were men. Male patients accounted for a larger proportion of the ARDS group (70.27%) than of the non-ARDS group (48.72%). About 13.64% of the patients were smokers, and the proportion of smokers was larger in the ARDS group (24.32%) than in the non-ARDS group (10.26%). The patients in the ARDS and non-ARDS groups did not differ significantly in terms of weight and clinical features upon admission. With respect to clinical features, fever was the most common symptom upon admission and was observed in 89.61% of the total population. The median time from symptom onset to hospitalization in the ARDS and non-ARDS groups was 8 and 10 days, respectively, with no statistically significant difference. Moreover, patients with comorbidities were more likely to develop ARDS than patients without comorbidities (72.97% vs. 35.04%). The comorbidities included hypertension (43.24% vs. 23.08%), diabetes (35.14% vs. 12.82%), coronary heart disease (CHD, 16.22% vs. 6.84%), and malignancies (10.81% vs. 1.71%). However, the difference in the prevalence of CHD was not significant.
In terms of the results of the laboratory examinations performed upon admission, COVID-19 patients with ARDS had significantly higher levels of white blood cell [...]
Regarding other important indicators such as PaO2 and PaO2/FiO2 at admission, we recorded these indicators for only 29 patients, of whom 28 had ARDS and 1 did not. The median PaO2 and PaO2/FiO2 of the patients with ARDS were 64 mmHg and 101 mmHg, respectively; the values for the non-ARDS patient were 81 mmHg and 245 mmHg, respectively. Although the ARDS group had worse values, the data were insufficient for a meaningful comparison between the two groups.
The mortality in the ARDS group was 40.54%, much higher than the 0.85% in the non-ARDS group. This indicates that patients with ARDS have a higher risk of death and underscores the importance of a risk assessment model that provides a basis for early detection and early treatment.
Table 2 shows the predictive efficiency of continuous variables that were differentially distributed between the ARDS and non-ARDS groups. The cutoff criteria were selected using the best Youden indices, and the associated sensitivities, specificities, positive likelihood ratios (+LR), and negative likelihood ratios (−LR) are also presented. The AUC values were determined using the ROC curves, as shown in Figure S1. Among the continuous risk factors, ferritin level had the best differentiating ability, with an AUC of 0.872, and the associated cutoff value was > 950 ng/mL.
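The cutoff selection described above can be illustrated with a short Python sketch that scans candidate cutoffs, keeps the one with the largest Youden index, and derives the likelihood ratios from sensitivity and specificity. The function names are hypothetical and the marker values are invented toy data, not patient data from this study; only the sensitivity of 100% and specificity of 81.20% in the last lines are the values reported for the total risk score.

```python
def sensitivity_specificity(values, labels, cutoff):
    """Classify value > cutoff as positive; labels are 1 (ARDS) or 0 (non-ARDS)."""
    pairs = list(zip(values, labels))
    tp = sum(1 for v, y in pairs if y == 1 and v > cutoff)
    fn = sum(1 for v, y in pairs if y == 1 and v <= cutoff)
    tn = sum(1 for v, y in pairs if y == 0 and v <= cutoff)
    fp = sum(1 for v, y in pairs if y == 0 and v > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sens, spec):
    """Youden index J = sensitivity + specificity - 1."""
    return sens + spec - 1


def likelihood_ratios(sens, spec):
    """Return (+LR, -LR); a ratio is infinite when its denominator is zero."""
    plr = sens / (1 - spec) if spec < 1 else float("inf")
    nlr = (1 - sens) / spec if spec > 0 else float("inf")
    return plr, nlr


def best_cutoff(values, labels):
    """Try every observed value as a cutoff and keep the one with the largest J."""
    best = None
    for c in sorted(set(values)):
        j = youden_index(*sensitivity_specificity(values, labels, c))
        if best is None or j > best[1]:
            best = (c, j)
    return best


# Invented toy data for illustration only (not from the study).
toy_values = [200, 400, 600, 700, 650, 1200, 1500, 2000, 2500]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, j = best_cutoff(toy_values, toy_labels)
print(cutoff, round(j, 2))

# Sensitivity and specificity reported for the total risk score.
plr, nlr = likelihood_ratios(1.0, 0.812)
print(round(plr, 2), nlr)
```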
The results for the association between differentially distributed risk factors and the risk of ARDS are shown in Table 3. The unadjusted models were first used to analyze the risk of individual variables. Then, the multivariate logistic regression model was fitted with a stepwise selection method, and LDH, BUN, D-dimer, PCT, and ferritin levels were selected and used in the combined prediction model. Specifically, COVID-19 patients with LDH values ≥295 U/L were assigned a score of 25; those with BUN values > 5.12 mmol/L, a score of 15; D-dimer values ≥5.00 μg/mL, a score of 34; PCT values > 0.11 ng/mL, a score of 20; and serum ferritin values ≥950 ng/mL, a score of 24. The total risk scores were calculated as the sum of the scores of the individual risk factors. The ROC curves of the total risk score for predicting ARDS are depicted in Fig. 1, and the associated sensitivities, specificities, +LRs, and -LRs of each cutoff point are presented in Table S1. In the current study cohort, the optimal cutoff value of the ARDS risk score was > 35, with a sensitivity of 100% and a specificity of 81.20%. The AUC of the ARDS score based on the ROC curve was 0.967 (95% CI: 0.925-0.989).
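The scoring rule described in this paragraph can be written as a short Python sketch. The thresholds, inequality directions, points, and the cutoff of > 35 are taken from the text above; the function names, dictionary keys, and the example patient are hypothetical and only serve the illustration.

```python
import operator

# marker -> (comparison, threshold, points), as stated in the text above.
CRITERIA = {
    "ldh_u_l": (operator.ge, 295.0, 25),         # LDH >= 295 U/L
    "bun_mmol_l": (operator.gt, 5.12, 15),       # BUN > 5.12 mmol/L
    "d_dimer_ug_ml": (operator.ge, 5.00, 34),    # D-dimer >= 5.00 ug/mL
    "pct_ng_ml": (operator.gt, 0.11, 20),        # PCT > 0.11 ng/mL
    "ferritin_ng_ml": (operator.ge, 950.0, 24),  # ferritin >= 950 ng/mL
}
CUTOFF = 35  # a total score > 35 is classified as high risk


def ards_risk_score(labs):
    """Sum the points of every marker whose admission value meets its criterion."""
    total = 0
    for name, (compare, threshold, points) in CRITERIA.items():
        if compare(labs[name], threshold):
            total += points
    return total


def is_high_risk(labs):
    return ards_risk_score(labs) > CUTOFF


# Hypothetical admission values, for illustration only.
patient = {"ldh_u_l": 310.0, "bun_mmol_l": 6.0, "d_dimer_ug_ml": 1.2,
           "pct_ng_ml": 0.05, "ferritin_ng_ml": 400.0}
print(ards_risk_score(patient), is_high_risk(patient))
```

In this hypothetical case, elevated LDH and BUN give 25 + 15 = 40 points, which exceeds the cutoff.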
In this study, data obtained from the remaining 30% of the patients were used as the validation dataset. In total, 17 of 66 patients had ARDS. The sensitivity of the model was 71.43%, and the specificity was 78.85%. The AUC of the model in the test sample was 0.819 (95% CI: 0.680-0.959). The P value for the Hosmer-Lemeshow test was 0.312, indicating that the model had a good fit (Table 4).
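For readers who wish to reproduce an AUC of this kind, the empirical AUC equals the probability that a randomly chosen patient with ARDS has a higher score than a randomly chosen patient without ARDS, with ties counted as one half. The following Python sketch computes it by direct pairwise comparison; the function name is hypothetical and the scores are invented toy values, not study data, and the confidence interval is not computed.

```python
def empirical_auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive case scores higher and 0.5 if tied.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented total risk scores for illustration only.
toy_ards = [40, 59, 35]
toy_non_ards = [0, 25, 35, 15]
print(round(empirical_auc(toy_ards, toy_non_ards), 3))
```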

Discussion
This is a retrospective study of patients diagnosed with COVID-19 and admitted to Wuhan Union Hospital. By comparing patients with and without ARDS, a panel of risk factors was identified. The univariate logistic regression model was used to assess the risk of individual factors, and the multivariate logistic regression model was utilized to identify the factors for the risk prediction model, which include LDH, BUN, D-dimer, PCT, and serum ferritin levels. Moreover, by assigning a risk score to each significant factor and calculating the total risk score, COVID-19 patients with a high risk of ARDS during hospitalization could be identified, and this risk evaluation system had good predictive efficiency.
Among the variables included in the prediction model, the D-dimer level was assigned the largest risk score, at 34. Although the underlying mechanism of COVID-19 is still unknown, patients with this condition have an increased risk of thrombosis (preliminary data not shown). Bronchoscopy showed red jelly-like sputum, and biopsy of the lung tissues revealed disseminated hemorrhage in the pulmonary alveoli and clot formation within the microvessels [19,20]. This result emphasizes the role of blood clotting dysfunction, as indicated by elevated D-dimer levels. Intra-alveolar hemorrhage and intravascular thrombosis are bound to reduce gas exchange function, and this leads to severe hypoxia, which is considered a key manifestation of ARDS [8].
In previous studies, an elevated ferritin level was considered a risk factor for the severity of different types of infection [21][22][23]. Although several studies have compared the difference in risk factors between patients with favorable and unfavorable outcomes, sufficient attention has not been paid to ferritin levels [10,13,24]. One recent study reported that the non-survivors of COVID-19 had higher serum ferritin levels than survivors [25]. On the one hand, the formation of toxic hydroxyl radicals from superoxide anions and hydrogen peroxide requires free iron, the storage of which correlates with that of ferritin. On the other hand, proinflammatory cytokines, including interleukin-1β (IL-1β), interleukin-6 (IL-6), and tumor necrosis factor-α, can directly increase the synthesis of ferritin [26].
ARDS is characterized by acute, diffuse, inflammatory lung injury, and inflammatory markers, including CRP, ESR, PCT, and serum ferritin, may worsen the clinical symptoms [27,28]. In addition to serum ferritin, PCT was included in the prediction model. Although ferritin is considered a marker of tissue inflammation, PCT is more commonly considered an indicator of bacterial infection [29,30]. This result indicates that concomitant bacterial infection may play an important role in the progression of ARDS among patients with COVID-19, and this finding is in accordance with a previous hypothesis on the pathogenesis of the disease [31][32][33]. Thus, for high-risk patients with elevated PCT levels, treatment with antibiotics may be effective in preventing ARDS. Although we have not obtained precise etiological evidence, we found that the procalcitonin level in the ARDS group was higher and exceeded the normal range, accompanied by increases in white blood cell count, C-reactive protein, ESR, and ferritin and a decrease in lymphocyte count. This also suggests that patients in the ARDS group have worse immunity and a higher risk of bacterial or fungal co-infection.
Since SARS-CoV-2 attacks pulmonary epithelial cells and there is a risk of bacterial infection, the release of intracellular LDH, which is considered a general index of cell injury, is bound to increase [34,35]. Moreover, recent studies have revealed that lactate may suppress the function of immune cells, and LDH can be an indicator of immunosuppression [36,37]. Thus, special attention must be paid to this finding, as the current research and previous reports [10,25] have shown that a decreased number of lymphocytes can be associated with unfavorable outcomes among patients with COVID-19. This result indicates that immune suppression has an important role in disease prognosis. However, in our model, the addition of lymphocyte count did not improve the predictive efficiency, indicating that the effect of lymphocytes on ARDS may be explained by other variables, including LDH.
Moreover, the importance of BUN in predicting ARDS has not received special attention. As severe infection and tissue damage can increase the rate of protein degradation [38], patients with ARDS had elevated BUN levels. Moreover, the cutoff value for BUN levels in predicting ARDS is > 5.12 mmol/L, which is still within the normal range. BUN levels higher than the normal range can indicate > 50% loss of renal function, and this underscores the importance of preserving renal function in patients with COVID-19, which plays a decisive role in fluid control for the treatment of patients with ARDS [27].
With respect to the specific scores assigned to the individual risk factors, these variables may be divided into a high-score group, comprising LDH, D-dimer, and ferritin levels, with scores of 25, 34, and 24, respectively, and a low-score group, comprising BUN and PCT, with scores of 15 and 20, respectively. As the optimal cutoff value is > 35, a positive result for any single factor is not sufficient to identify high-risk patients, whereas any combination of three factors is sufficient. However, when evaluation is based on two variables, at least one must belong to the high-score group, because BUN and PCT together give a total of exactly 35.
This study had several strengths. First, Wuhan Union Hospital is one of the first designated institutions for treating patients with COVID-19 in Wuhan. Thus, the participants constituted a representative sample of hospitalized patients. Second, this study had one of the largest populations with definite outcomes during hospitalization, thereby providing strong evidence for depicting the risk factors of ARDS among patients with COVID-19. Third, after a systematic selection of possible risk factors, a panel of indices routinely tested in clinical settings was selected, and a risk evaluation score was designed with a relatively good predictive efficacy for practical use.
However, this study also had several limitations. First, the interpretation of the results may be limited by the relatively small sample size. Second, studies involving external verification must be conducted to validate the efficacy of our model in predicting ARDS. Finally, due to the heavy clinical workload during the epidemic, some important indicators such as PaO2 and PaO2/FiO2 were not recorded in time, so these indicators could not be included in our research.
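The score arithmetic discussed above (no single factor reaches the cutoff, any three factors do, and two factors suffice only if one has a high score) can be checked by enumerating all combinations. The following Python sketch uses the published scores and the cutoff of > 35; the function name is hypothetical.

```python
from itertools import combinations

# Published points per risk factor and the cutoff (high risk if total > 35).
SCORES = {"LDH": 25, "BUN": 15, "D-dimer": 34, "PCT": 20, "ferritin": 24}
CUTOFF = 35


def combos_not_exceeding(k):
    """Return the k-factor combinations whose summed score does not exceed the cutoff."""
    return [combo for combo in combinations(SCORES, k)
            if sum(SCORES[factor] for factor in combo) <= CUTOFF]


for k in (1, 2, 3):
    print(k, combos_not_exceeding(k))
```

The enumeration shows that all five single factors stay at or below 35, that BUN plus PCT (15 + 20 = 35) is the only pair that does not exceed the cutoff, and that every combination of three factors exceeds it.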

Conclusions
Several variables were found to be differentially distributed between COVID-19 patients with and without ARDS. Using a stepwise selection method, a panel of risk factors, including LDH, BUN, D-dimer, PCT, and ferritin levels, was included in the prediction model, and a risk evaluation scoring system was established, with an optimal cutoff value > 35 and AUC of 0.967 (95% CI: 0.925-0.989). Moreover, the model had excellent discrimination and calibration during the internal validation and is considered practical for clinicians. The novel risk scoring system may be a valuable tool for screening COVID-19 patients with a high risk of ARDS.