Severity-associated markers and assessment model for predicting the severity of COVID-19: a retrospective study in Hangzhou, China

Background The severity of COVID-19 is associated with clinical decision making and patient prognosis, so early identification of patients who are likely to develop severe or critical COVID-19 is critical in clinical practice. The aim of this study was to screen severity-associated markers and construct an assessment model for predicting the severity of COVID-19.
Methods A total of 172 confirmed COVID-19 patients were enrolled from two designated hospitals in Hangzhou, China. Ordinal logistic regression was used to screen severity-associated markers. Least Absolute Shrinkage and Selection Operator (LASSO) regression was performed for further feature selection. Assessment models were constructed using logistic regression, ridge regression, support vector machine and random forest. The area under the receiver operator characteristic curve (AUROC) was used to evaluate the performance of the models. Internal validation was performed by bootstrap with 500 resamples in the training set, and external validation was performed in the validation set for each of the four models.
Results Age, comorbidity, fever, and 18 laboratory markers were associated with the severity of COVID-19 (all P values < 0.05). After LASSO regression, eight markers were included in the assessment model construction. The ridge regression model had the best performance, with AUROCs of 0.930 (95% CI, 0.914–0.943) and 0.827 (95% CI, 0.716–0.921) in the internal and external validations, respectively. A risk score based on the ridge regression model had good discrimination in all patients, with an AUROC of 0.897 (95% CI 0.845–0.940) and a well-fitted calibration curve. Using the optimal cutoff value of 71, the sensitivity and specificity were 87.1% and 78.1%, respectively. A web-based assessment system was developed based on the risk score.
Conclusions Eight clinical markers (lactate dehydrogenase, C-reactive protein, albumin, comorbidity, electrolyte disturbance, coagulation function, eosinophil count and lymphocyte count) were associated with the severity of COVID-19. An assessment model constructed with these eight markers would help clinicians evaluate, at admission, the likelihood that a patient will develop severe COVID-19 and take early measures in clinical treatment.
Supplementary Information The online version contains supplementary material available at 10.1186/s12879-021-06509-6.


Background
Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has spread worldwide [1]. On March 12, 2020, the World Health Organization (WHO) declared the disease a pandemic. It had affected more than 200 countries, with about 10,000,000 confirmed cases, as of July 01, 2020 [2]. The COVID-19 epidemic has therefore become a global public health crisis.
Different clinical patterns, such as mild, moderate, and severe to critical types, were observed in patients with COVID-19. Although most COVID-19 patients have mild or moderate symptoms and signs, findings from China indicated that about 14% of patients were of the severe type and 5% were of the critical type [3]. Previous studies and clinical practice showed that the degree of severity was associated with the clinical treatment and prognosis of the disease [3][4][5][6]. The average overall case-fatality rate of confirmed COVID-19 patients was 2.3%, but it was as high as 49.0% in critical patients [3]. Failure to recognise severe cases delays appropriate clinical treatment and increases the possibility of poor prognosis. On the other hand, treating a severe or critical COVID-19 patient requires vast medical resources, so overestimating severity overuses medical resources and increases the medical burden. Therefore, early identification of patients who are likely to develop severe or critical COVID-19 is especially important for clinical practice and epidemic control. In clinical practice, the severity of COVID-19 is categorised into four levels (mild, moderate, severe, and critical) according to the Seventh Edition of the Guide to Diagnosis and Treatment of New Coronary Pneumonia [7]. This classification is performed mainly on the basis of clinical symptoms, oxygen saturation (SaO2), and imaging evidence from computed tomography (CT). However, no evidence from laboratory markers has been included. Previous studies have found that lymphopenia, organ dysfunction, coagulopathy, and elevated D-dimer levels were associated with severity [3][4][5][6][8].
In this study, we aimed to screen severity-associated markers and construct an assessment model for predicting the severity of patients with COVID-19 based on the data from two hospitals in Hangzhou, Zhejiang province, China.

Study population
This study enrolled 172 confirmed COVID-19 patients from January 20, 2020 to April 1, 2020 in Hangzhou, Zhejiang Province, China. Among these patients, 104 from Hangzhou Xixi Hospital formed the training set, which was used for screening the severity-associated markers and constructing the assessment model. Some of these 104 patients had been included in previously published studies [9,10]. The other 68 patients, from the First Affiliated Hospital, School of Medicine, Zhejiang University (FAHZJU), formed the validation set used to validate the model. These patients were part of a sample that had been published previously [10]. COVID-19 was diagnosed according to the interim guidance from the WHO [11]. The severity of COVID-19 was categorised into four levels according to the Seventh Edition of the Guide to Diagnosis and Treatment of New Coronary Pneumonia [7]. The mild type was defined as patients with mild clinical symptoms and normal imaging on CT. The moderate type was defined as patients with fever, respiratory symptoms, or other symptoms, and altered imaging evidence with pneumonia. The severe type was defined as patients with at least one of the following: shortness of breath (breathing rate ≥ 30/min), SaO2 at rest ≤ 93%, partial pressure of oxygen in arterial blood (PaO2)/inspired oxygen fraction (FiO2) ≤ 300 mmHg, or lung infiltrates > 50% within 24 to 48 h. The critical type was defined as patients with any of the following: respiratory failure requiring mechanical ventilation, shock, or a combination of other organ failures requiring ICU monitoring and treatment.
This was a retrospective study and the protocol was approved by the Ethics Committee of Xixi Hospital and FAHZJU.

Data collection
Data at admission, including demographic information, comorbidities, clinical symptoms and laboratory tests, were extracted from electronic medical records. Collected data were reviewed by a trained team of clinical physicians. Demographic information included age, sex and body mass index (BMI). Comorbidity was defined as having at least one of the following diseases: diabetes, hypertension, cardiovascular disease, severe congenital disease, cancer and chronic liver, renal, respiratory disease. Recorded clinical symptoms included shortness of breath, diarrhoea and myalgia. Laboratory markers fell into the following eight categories: inflammation, electrolytes, nutritional metabolism, and liver, renal, cardiac, respiratory and coagulation functions.

Statistical analysis
Continuous variables were presented as median (interquartile range [IQR]), and categorical variables were presented as numbers (percentage). Continuous laboratory markers were dichotomised (normal versus abnormal) according to their clinical reference values. Severity-associated markers of COVID-19 were screened using ordinal logistic regression.
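For orientation, ordinal logistic regression is commonly written in the proportional-odds form sketched below. The source does not state which parameterisation was used, so this form and the symbols Y (severity level with J ordered categories), x (1 if a marker is abnormal, 0 if normal), α_j (cut-points) and β (marker coefficient) are assumptions for illustration only.

```latex
\operatorname{logit} P(Y \le j \mid x) = \alpha_j - \beta x, \qquad j = 1, \dots, J-1

\mathrm{OR} = \frac{\operatorname{odds}(Y > j \mid x = 1)}{\operatorname{odds}(Y > j \mid x = 0)} = e^{\beta}
```

Under this form, a single odds ratio per marker describes the shift towards higher severity across all cut-points, which is what makes it convenient for screening dichotomised markers.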
To construct an assessment model, two criteria were set for selecting markers: P value < 0.05 in the ordinal logistic regression, and an abnormality in the marker in at least half of severe or critical patients. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for further feature selection. The optimal regularization parameter (λ) was estimated by fivefold cross-validation. To increase the stability of feature selection, we used bootstrap with 1000 resamples and built a LASSO regression model for each bootstrap set. The markers that were present in more than half of all bootstrap sets were included in the final model. Assessment models were constructed using logistic regression, ridge regression, support vector machine, and random forest in the training set. The performance of the models was evaluated by the area under the receiver operator characteristic curve (AUROC). For the internal validation, we used bootstrap with 500 resamples to reduce over-fitting. For the external validation, each of the four models was assessed in the validation set. A risk score was established according to the result of the best model. The performance of the risk score in all patients was evaluated using AUROC and a calibration curve. The optimal cutoff value was the one that maximised the Youden index. A web-based assessment system was developed based on the risk score.
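The "present in more than half of all bootstrap sets" rule can be sketched as follows. This is a minimal illustration, not the authors' R code: the function name `stable_markers`, the `min_fraction` parameter and the toy selections are hypothetical, and the LASSO fits themselves are assumed to have been run elsewhere, each yielding the set of markers with non-zero coefficients.

```python
from collections import Counter


def stable_markers(selected_per_resample, min_fraction=0.5):
    """Return markers selected in MORE than `min_fraction` of the resamples.

    `selected_per_resample` holds one collection of marker names per
    bootstrap LASSO fit (the markers with non-zero coefficients).
    """
    n = len(selected_per_resample)
    if n == 0:
        return []
    # Count each marker at most once per resample.
    counts = Counter(m for sel in selected_per_resample for m in set(sel))
    return sorted(m for m, c in counts.items() if c / n > min_fraction)


# Toy input: four hypothetical bootstrap fits (not data from the study).
toy_fits = [
    {"LDH", "CRP", "ALB"},
    {"LDH", "CRP"},
    {"LDH", "ALB", "age"},
    {"LDH", "CRP", "ALB"},
]
kept = stable_markers(toy_fits)
print(kept)
```

In this toy input, "age" appears in only one of four fits and is dropped, while the other three markers pass the threshold. A marker selected in exactly half of the fits would also be dropped, because the rule requires more than half.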
All statistical analyses were conducted using R software, version 3.6.2 (R Foundation for Statistical Computing). A two-sided P value < 0.05 was considered statistically significant.

Basic characteristics of the study population
The flowchart of the study procedure is illustrated in [...] Table 2 presents the associations of clinical characteristics with the severity of COVID-19 in the training set. For demographic characteristics and clinical symptoms, age, comorbidity, and fever were associated with the severity of COVID-19 (all P values < 0.05). For dichotomous laboratory markers, higher levels of C-reactive protein (CRP), lactate dehydrogenase (LDH), serum amyloid A, fibrinogen (FIB), D-dimer, adenosine deaminase, reduced haemoglobin, and lower levels of lymphocyte, eosinophil, platelet counts, calcium, phosphorus, albumin (ALB), albumin/globulin, prealbumin, total cholesterol, high [...] Table S1.

Model construction and evaluation
Based on the criteria described in the Methods, 18 candidate markers and 90 patients were selected for the model construction. Because of their similar clinical function, D-dimer and FIB were combined into a new coagulation function variable, DFIB. Abnormal DFIB was defined as abnormal D-dimer or abnormal FIB. Electrolyte disturbance was calculated as the sum of abnormalities in calcium, phosphorus, potassium, sodium and chlorine. Thus, 16 markers were included in LASSO regression for further feature selection. After 1000 bootstrap resamples, ALB, CRP, LDH, DFIB, comorbidity, lymphocyte count, eosinophil count, and electrolyte disturbance were finally selected as the predictors in the model. The detailed frequency of each marker in the 1000 LASSO models is summarised in Additional file 1: Tables S2 and S3. Table 3 presents the performance of each model in the internal and external validations. In the internal validation, the AUROCs of the four models (logistic regression, ridge regression, support vector machine, and random forest) were high, ranging from 0.919 (95% CI 0.793-0.955) to 0.973 (95% CI 0.935-0.993). In the external validation, the ridge regression model showed the best performance, with the highest AUROC of 0.827 (95% CI 0.716-0.921). Therefore, the ridge regression model was considered the best model.
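The two derived variables described above can be sketched as below. This assumes each laboratory marker has already been dichotomised into an abnormal/normal flag; the function names, the dictionary layout and the toy patient are hypothetical, and the handling of missing tests is not specified in the source, so this sketch simply requires all five electrolyte flags to be present.

```python
# The five electrolytes named in the text.
ELECTROLYTES = ("calcium", "phosphorus", "potassium", "sodium", "chlorine")


def dfib_abnormal(d_dimer_abnormal, fib_abnormal):
    """Combined coagulation flag: abnormal if D-dimer OR FIB is abnormal."""
    return bool(d_dimer_abnormal or fib_abnormal)


def electrolyte_disturbance(flags):
    """Number of abnormal electrolytes (0 to 5).

    `flags` maps each electrolyte name to True (abnormal) or False (normal).
    A missing key raises KeyError rather than being treated as normal.
    """
    return sum(1 for name in ELECTROLYTES if flags[name])


# Toy patient (hypothetical values, not data from the study).
patient = {
    "calcium": True,
    "phosphorus": False,
    "potassium": False,
    "sodium": True,
    "chlorine": False,
}
print(dfib_abnormal(d_dimer_abnormal=True, fib_abnormal=False))
print(electrolyte_disturbance(patient))
```

Combining the two coagulation markers with a logical OR flags a patient as soon as either test is abnormal, while the electrolyte count keeps a graded measure instead of five separate binary predictors.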
A risk score was then calculated according to the result of the ridge regression model using the following formula: [...] The risk score indicated good discrimination of severe or critical type with an AUROC of 0.897 (95% CI 0.845-0.940). In addition, the calibration curve showed good consistency between the predicted and actual probabilities of severe or critical type. Using the optimal cutoff value of 71, the sensitivity of the risk score was 87.1% and the specificity was 78.1% for predicting COVID-19 severity (Figure 3). To help clinicians detect patients who were likely to develop severe or critical COVID-19 at admission, we developed a web-based assessment system based on our risk score (Fig. 4, Website: http://www.gtrsp.com:8011/).
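The discrimination and cutoff statistics reported above can be illustrated with a small pure-Python sketch. It is not the authors' R analysis: the function names and toy scores are hypothetical, and it assumes that a patient is flagged as severe or critical when the score is greater than or equal to the cutoff, which the source does not state explicitly.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative
    (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximising
    J = sensitivity + specificity - 1, predicting positive when
    score >= cutoff. Ties in J keep the lowest cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Toy scores and outcomes (1 = severe or critical); not data from the study.
toy_scores = [10, 20, 30, 40, 50, 60, 70, 80]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(auroc(toy_scores, toy_labels))
print(youden_cutoff(toy_scores, toy_labels))
```

The Youden index weights sensitivity and specificity equally, so the chosen cutoff may not suit a setting where missing a severe case is costlier than over-triage; in that case a lower threshold could be preferred.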

Discussion
Early identification of patients who were likely to develop severe or critical COVID-19 would help reduce the case-fatality rate and make efficient use of limited medical resources. In this study, we identified a panel of clinical markers associated with the severity of COVID-19 and constructed different severity-prediction models. We found that the ridge regression model was the best, based on high AUROCs of 0.930 (95% CI, 0.914-0.943) and 0.827 (95% CI, 0.716-0.921) in the internal and external validations, respectively. Furthermore, we established a risk score and a web-based assessment system to help clinicians detect patients who were likely to develop severe or critical COVID-19 at admission.
Previous studies showed that severe or critical COVID-19 patients were older, had more comorbidities, had higher levels of LDH, D-dimer, and CRP, and had lower levels of ALB and lymphocyte count [3][4][5][6][8]. These findings were consistent with our study. Moreover, using the data of 208 patients from Fuyang, Anhui Province, Ji et al. [12] established a scoring model named CALL to predict the severity of COVID-19. Dong et al. [13] also developed a scoring system with the data of 147 patients from Wuhan, Hubei Province. The AUROCs of their models were 0.910 and 0.843, slightly lower than the AUROC of our assessment model in the internal validation (0.930). However, their models were not validated in an external dataset, which limits their generalizability. In contrast, our model was validated in an independent dataset and obtained a satisfactory AUROC of 0.827.
Among the eight markers in our model, LDH, CRP, ALB, and lymphocyte count were well-recognized predictors of COVID-19 severity [16]. For eosinophil count, Zhu et al. [14] demonstrated that decreased eosinophils could induce acute lung injury in a mouse model. Liu et al. [15] also found that an increased eosinophil count predicted improvement in COVID-19 progression. Several studies reported that severe or critical COVID-19 patients often experienced electrolyte disturbances [3, 4, 16]. In our study, we used the sum of abnormalities in potassium, calcium, sodium, phosphorus and chlorine to evaluate the degree of electrolyte disturbance comprehensively. D-dimer and FIB are indicators of coagulation function. Chen et al. [8] reported that patients infected with SARS-CoV-2 had abnormal coagulation function (hypercoagulation). We combined the two indicators to increase the sensitivity for detecting abnormal coagulation and to avoid collinearity between the two markers. Unlike other studies, our final model did not include age. This might be owing to the high correlation between age and comorbidity in the training set, with the LASSO regression identifying comorbidity as the more important marker.
There were several limitations in our study. First, the distribution of COVID-19 severity differed between the training and validation sets: there were no critical cases in the training set and no mild cases in the validation set. This difference was due to the government rules on COVID-19 prevention and control in Zhejiang Province, China. Xixi Hospital (a municipal-level hospital for infectious diseases) mainly receives and treats patients with mild, moderate, and severe COVID-19 (no critical patients), while FAHZJU (a provincial-level hospital) is mainly responsible for moderate, severe, and critical patients (no mild patients). The different severity distributions might have influenced the model construction and validation.
However, even with these differences, satisfactory performance was still obtained in the validation stage, which indicated relatively high generalizability of our model. Second, the subjects were mainly recruited from Hangzhou and the sample size was relatively small, which would limit the generalizability of our model. Additional validation in areas outside Zhejiang should be conducted in the future. Third, because of the retrospective study design, some laboratory tests were not done in some patients. Therefore, their associations with the severity of COVID-19 might be [...]
OR, Odds ratio; CI, Confidence interval; BMI, Body mass index; WBC, White blood cell; CRP, C-reactive protein; LDH, Lactate dehydrogenase; SAA, Serum amyloid A; ALB, Albumin; ALB/GLB, Albumin/Globulin; TC, Total cholesterol; HDL-C, High density lipoprotein cholesterol; PALB, Prealbumin; RBP, Retinol binding protein; Apo A1, Apolipoprotein A1; GLB, Globulin; FIB, Fibrinogen; ALT, Alanine aminotransferase; AST, Aspartate aminotransferase; SaO2, Oxygen saturation; PaO2, Partial pressure of oxygen in arterial blood; FiO2, Inspired oxygen fraction; ADA, Adenosine deaminase; PaCO2, Partial pressure of carbon dioxide; γ-GTP, γ-glutamyltranspeptidase; IgG, Immunoglobulin G; GPDA, Glycyl-proline-dipeptidyl aminopeptidase; FFA, Free fatty acids; eGFR, Estimated glomerular filtration rate; DBIL, Direct bilirubin
a Data were presented as median (IQR), or n (%) where appropriate
b Comorbidity was defined as having at least one of the following diseases: diabetes, hypertension, cardiovascular disease, severe congenital disease, cancer and chronic liver, renal, respiratory disease