Comparison of mortality risk evaluation tools efficacy in critically ill COVID-19 patients
BMC Infectious Diseases volume 21, Article number: 1173 (2021)
As the COVID-19 pandemic continues, the number of patients admitted to the intensive care unit (ICU) is still increasing. The aim of our article is to estimate which of the conventional ICU mortality risk scores is the most accurate at predicting mortality in COVID-19 patients and to determine how these scores can be used in combination with the 4C Mortality Score.
This was a retrospective study of critically ill COVID-19 patients treated in tertiary reference COVID-19 hospitals during the year 2020. The 4C Mortality Score was calculated upon admission to the hospital. The Simplified Acute Physiology Score (SAPS) II, Acute Physiology and Chronic Health Evaluation (APACHE) II, and Sequential Organ Failure Assessment (SOFA) scores were calculated upon admission to the ICU. Patients were divided into two groups: ICU survivors and ICU non-survivors.
A total of 249 patients were included in the study, of which 63.1% were male. The mean age of all patients was 61.32 ± 13.3 years. The all-cause ICU mortality rate was 41.4% (n = 103). To determine the accuracy of the ICU mortality risk scores, a ROC-AUC analysis was performed. The most accurate scale was the APACHE II, with an AUC value of 0.772 (95% CI 0.714–0.830; p < 0.001). All of the ICU risk scores and the 4C Mortality Score were significant mortality predictors in the univariate regression analysis. A multivariate regression analysis was then performed to identify which of the scores retain independent predictive value when used in combination. In the final model, the APACHE II and 4C Mortality Score prevailed. For each point increase in the APACHE II, the odds of death increased by a factor of 1.155 (OR 1.155, 95% CI 1.085–1.229; p < 0.001), and for each point increase in the 4C Mortality Score, the odds of death increased by a factor of 1.191 (OR 1.191, 95% CI 1.086–1.306; p < 0.001). This model demonstrated the best overall calibration.
The study demonstrated that the APACHE II had the best discrimination of mortality in ICU patients. Both the APACHE II and 4C Mortality Score independently predict mortality risk and can be used concomitantly.
As the SARS-CoV-2 (COVID-19) pandemic continued through the year 2020, the number of COVID-19 patients admitted to the intensive care unit (ICU) around the world increased sharply. According to several reports, over 20% of patients hospitalized with COVID-19 are admitted to the ICU. Despite working at maximum capacity and increasing the number of beds and personnel, these services are overstretched, leading to worse clinical outcomes. Given the strain on health care systems, this raises the question of amending triage criteria, since not all patients can or should be admitted to the ICU [3,4,5]. Furthermore, the strategy of admission to the ICU varies between countries. For example, in Belgium, more patients died in the wards (72%) than in the ICU (28%). The chances of survival are affected by many variables, including the diagnosis upon admission, the patient’s comorbidities, the severity of organ failure, the patient’s age and the patient’s health status before admission. Therefore, it is critical to triage patients rigorously, determining which patients have the best chances of successful treatment.
There are several scores used in the ICU to help clinicians estimate the mortality risk of patients. Three of the most common are the Simplified Acute Physiology Score (SAPS) II, Acute Physiology and Chronic Health Evaluation (APACHE) II and Sequential Organ Failure Assessment (SOFA) score. The SOFA score uses clinical parameters and laboratory values, while the SAPS II and APACHE II also include age, history of severe organ failure or chronic disease and type of admission. The APACHE II and SAPS II should be calculated on newly admitted patients, while the SOFA score can be recalculated every 24 h. All of these scales have good calibration and discrimination across the range of possible values, estimating the risk of mortality from 1% up to almost 100%. However, it is important to note that the APACHE II was originally developed to fit all kinds of ICU populations. Therefore, it might not be as precise when evaluating specific patient groups and individual patients. The SAPS II was built in a European/North American environment, which matters when evaluating patients on other continents [9, 10]. Furthermore, all these scores focus on a momentary evaluation and do not give enough attention to the previous state of the patient, in terms of both chronic comorbidities and ongoing decompensation. In short, none of these scores is perfect in every setting, and none of them is specific to one illness, so they may be less accurate in patients who have a particular disease.
Since the beginning of the COVID-19 pandemic, the patients with the highest mortality have been those with chronic diseases, such as arterial hypertension, diabetes, heart failure, obesity and chronic kidney disease. Due to these particularities, COVID-19 merits a risk stratification model of its own. Thus, the 4C Mortality Score was developed in the year 2020 in the middle of the pandemic. Using eight different parameters to evaluate patients, it was originally tested in a population that was fully Caucasian, 43% male, had a mean age of 73 and a mortality rate of 32.3%. The 4C Mortality Score is designed to be applied at the moment of hospitalization and was not intended to be used when admitting the patient to the ICU. However, it is highly specific to COVID-19, as it encompasses the parameters that are the most critical in this disease (i.e., those that reflect respiratory function and inflammatory processes) as well as patient demographics and comorbidities. It is likely that the effect of the 4C Mortality Score determinants remains present during treatment, contributing to deterioration, transfer to the ICU and, subsequently, mortality of the population.
The aim of this study was to estimate which of the conventional ICU mortality risk scores is the most accurate at predicting mortality in COVID-19 patients and to determine how these scores can be used in combination with the 4C Mortality Score.
This was a retrospective study of patients who were admitted to a tertiary referral university hospital in 2020 and tested positive for SARS-CoV-2. Ethical approval to conduct the study was obtained from the Regional Research Ethics Committee. Inclusion criteria were as follows: a positive test for SARS-CoV-2 infection, age of 18 years or older and admission to the intensive care unit (ICU).
Mortality risk evaluation
The patients were evaluated twice during the study. Upon admission to the hospital, the 4C Mortality Score was calculated. Upon admission to the ICU, the APACHE II, SAPS II and SOFA scores were calculated. All the scores were used as suggested by their creators [9, 10, 12, 13].
Definition of the outcome and groups in the study
Mortality was defined as all-cause mortality in the ICU. By the time of data collection, every patient in the study had either been discharged or had died. Mortality ratios were calculated and standardized according to the recommendations of the authors of the risk evaluation tools. The patients were split into two groups: ICU survivors and ICU non-survivors.
Statistical analysis was carried out by the SPSS statistical software package version 26.0 (IBM/SPSS, Inc., Chicago, IL). Baseline characteristics were defined using descriptive statistics. Categorical variables were stated as an absolute number (n) and a relative frequency (%), and continuous variables were represented as a median (interquartile range) or as a mean (± SD), depending on the normality of the distribution. The normality of distribution was tested by the one sample Kolmogorov–Smirnov test.
Comparison of survivors and non-survivors
To compare the categorical variables, the chi-square test was performed. To compare the continuous variables, the independent samples t-test was used for the normally distributed data and the Mann–Whitney test for the non-parametric data.
Standardized mortality ratio calculation
The standardized mortality ratio (SMR) represents the excess mortality and was calculated as the observed number of deaths divided by the predicted number of deaths. The observed count was obtained from the study data. The predicted number was obtained by applying the mortality risk percentage given by each of the tools used (APACHE II, SOFA, SAPS II and 4C Mortality Score). The individual values of the risk scores were averaged to represent the study population.
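The ratio described above can be illustrated with a minimal Python sketch. It is not the study's SPSS procedure. The function name and the ten-patient cohort are hypothetical, and it assumes that each patient's score has already been converted to a predicted probability of death.

```python
def standardized_mortality_ratio(observed_deaths, predicted_risks):
    """Observed deaths divided by the deaths expected from per-patient predicted risks."""
    if not predicted_risks:
        raise ValueError("predicted_risks must not be empty")
    # Summing per-patient risks is the same as (mean risk) x (number of patients).
    expected_deaths = sum(predicted_risks)
    if expected_deaths <= 0:
        raise ValueError("expected number of deaths must be positive")
    return observed_deaths / expected_deaths


# Hypothetical cohort: 10 patients, each with a predicted risk of 20%, of whom 4 died.
hypothetical_risks = [0.20] * 10
smr = standardized_mortality_ratio(4, hypothetical_risks)
print(round(smr, 2))
```

An SMR near 1 means the tool predicted about as many deaths as were observed. A value above 1 means more patients died than the tool predicted.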
The areas under the receiver operating characteristic curves (ROC-AUCs) were examined to assess the accuracy of discrimination of the APACHE II, SAPS II, SOFA and 4C Mortality Score.
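As an illustration of what the ROC-AUC measures, the sketch below computes it as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, counting ties as one half. This pairwise formulation is equivalent to the area under the ROC curve. The function name and the six-patient data set are hypothetical, and the study itself used SPSS.

```python
def roc_auc(scores, died):
    """AUC as the share of (non-survivor, survivor) pairs ranked correctly; ties count 0.5."""
    non_survivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for high in non_survivors:
        for low in survivors:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(non_survivors) * len(survivors))


# Hypothetical risk scores and outcomes (1 = died in the ICU, 0 = survived).
hypothetical_scores = [8, 10, 12, 15, 15, 20]
hypothetical_died = [0, 0, 1, 0, 1, 1]
auc = roc_auc(hypothetical_scores, hypothetical_died)
print(round(auc, 3))
```

A value of 0.5 corresponds to ranking patients no better than chance, and a value of 1.0 to perfect separation of survivors and non-survivors.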
To determine the independent predictive value of the risk scores, they were entered into a forward logistic regression analysis. The Hosmer–Lemeshow goodness-of-fit test was used to evaluate calibration.
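The goodness-of-fit test named above compares observed and expected deaths within groups of patients ordered by predicted risk. The following is a minimal sketch of the usual form of the statistic, assuming equal-sized groups and predicted probabilities from an already fitted model. The function name and data are hypothetical, and SPSS may handle group boundaries and ties differently.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (chi-square statistic, degrees of freedom) for grouped calibration."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    pairs = sorted(zip(probs, outcomes))  # order patients by predicted risk
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one patient per group")
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected deaths in this group
        observed = sum(y for _, y in chunk)  # observed deaths in this group
        variance = expected * (1 - expected / size)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, groups - 2


# Hypothetical data: three risk strata of four patients each.
hypothetical_probs = [0.25] * 4 + [0.5] * 4 + [0.75] * 4
well_calibrated = [0, 0, 0, 1] + [0, 0, 1, 1] + [0, 1, 1, 1]
poorly_calibrated = [0, 0, 0, 0] + [0, 0, 1, 1] + [1, 1, 1, 1]
good_stat, df = hosmer_lemeshow(hypothetical_probs, well_calibrated, groups=3)
poor_stat, _ = hosmer_lemeshow(hypothetical_probs, poorly_calibrated, groups=3)
print(good_stat, round(poor_stat, 3), df)
```

A small statistic, and therefore a large p-value, indicates that predicted and observed mortality agree across the risk groups. With the conventional ten groups, the test has 8 degrees of freedom.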
A total of 249 patients were included in the study, of which 63.1% were male. The mean age of the patients was 61.32 ± 13.3 years. Most of the patients were aged 50–70 years (55.4%). The highest mortality was observed in the > 80 years age group (63.6%). The most common comorbidities were hypertension (75.9%), chronic cardiac disease (46.6%) and obesity (28.9%) (Table 1).
SMRs were calculated, showing that observed mortality was several times higher than predicted by the APACHE II (SMR = 2.84), SOFA (SMR = 4.14) and SAPS II (SMR = 4.14). The SMR of the 4C Mortality Score was 1.05, showing good concordance with the actual mortality rate of the group (Table 1).
The values of the mortality risk scores were higher in the ICU non-survivors group (survivors vs non-survivors): SOFA, 3 [2–5] vs 5 [3–9] (p < 0.001); SAPS II, 21 [16–29] vs 32 [24–41] (p < 0.001); APACHE II, 10 [7–13] vs 15 [11–15] (p < 0.001); and 4C Mortality Score, 8 [6–11] vs 12 [9–15] (p < 0.001) (Table 1).
Moreover, the PaO2/FiO2 ratio was lower in the ICU non-survivors group: 84 [59.8–146.0] in non-survivors vs 161.5 [80.2–217.8] in survivors (p < 0.001). The average length of stay in the ICU of all patients was 9 days and the overall length of stay in the hospital was 17 days (Table 1).
Accuracy of mortality risk scores
To determine their accuracy of discrimination, a ROC-AUC analysis of the ICU mortality risk scores was performed. All the risk scores predicted mortality better than chance, with ROC-AUC values above 0.5. The most accurate scale was the APACHE II, with an AUC value of 0.772 (95% CI 0.714–0.830; p < 0.001). The 4C Mortality Score had an AUC value of 0.754 (95% CI 0.694–0.814; p < 0.001). These results are presented in Fig. 1 and Table 2.
Regression analysis of mortality risk scores
Univariate regression analysis was performed to determine the link between the risk scores and ICU mortality. All of the ICU risk scores and 4C Mortality Score were significant mortality predictors in the analysis, with acceptable calibration. The results are presented in Table 3.
Multivariate regression analysis was performed to identify which of the scores retain independent predictive value when used together. In the final model, the APACHE II and 4C Mortality Score prevailed, with the best fit of the model (χ2 = 4.72; degrees of freedom 8; p = 0.787). For each point increase in the APACHE II, the odds of death increased by a factor of 1.155 (OR 1.155, 95% CI 1.085–1.229; p < 0.001), and for each point increase in the 4C Mortality Score, the odds of death increased by a factor of 1.191 (OR 1.191, 95% CI 1.086–1.306; p < 0.001). Results are presented in Table 3. The R2 value of the final model was 0.358, suggesting a multifactorial origin of mortality in the ICU, which is presumably not determined solely by pre-hospitalization factors.
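Two pieces of arithmetic behind these figures can be checked directly. The sketch below is illustrative and not part of the study's analysis. It compounds a per-point odds ratio over several points, assuming the log-odds are linear in the score with the other score held fixed, as in a logistic model. It also recomputes the upper-tail probability of a chi-square statistic using the closed form that holds only for an even number of degrees of freedom. Both function names are hypothetical.

```python
import math


def compounded_odds_ratio(or_per_point, points):
    """Odds ratio for a rise of `points`, assuming log-odds are linear in the score."""
    return or_per_point ** points


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Values taken from the text: OR 1.155 per APACHE II point; chi-square 4.72 with 8 df.
apache_5_points = compounded_odds_ratio(1.155, 5)
fit_p_value = chi2_sf_even_df(4.72, 8)
print(round(apache_5_points, 2), round(fit_p_value, 3))
```

Under these assumptions, a five-point rise in the APACHE II roughly doubles the odds of death (about 2.06), and the goodness-of-fit statistic corresponds to a p-value of about 0.787. Odds ratios describe odds, not probabilities, so the factor should not be read as a doubling of the mortality risk itself.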
The main finding of the study is that the APACHE II score was the most accurate and had the best discrimination at predicting mortality risk in COVID-19 patients treated in the ICU. However, the best calibration was observed when the 4C Mortality Score was added to the model. Therefore, the APACHE II and 4C Mortality Score independently predict mortality risk and can be used concomitantly.
One of the main findings of the study is that conventional ICU mortality risk scores perform quite well in COVID-19 patients. In our study, the mean values of the APACHE II, SAPS II and SOFA scores are comparable with other reports. The overall mean APACHE II score in our study was 12, which is comparable to the 12.87 reported from India but lower than the values reported from Sao Paulo (16.7) and Pakistan (20.84) [14,15,16]. Furthermore, in our study, the APACHE II score was the most accurate, with the highest AUC of all the scores (0.772) [17, 18]. This result corresponds with the report from Sao Paulo (AUC 0.8). However, the overall accuracy is slightly lower than expected and reported in the literature. Furthermore, the SMRs are several times higher than expected in these patients. This can be explained by the pathophysiology of COVID-19, which affects some organ systems (i.e., respiratory and coagulation) far more extensively at the beginning of the disease. Thus, the conventional scores, even when giving a maximum score in these dimensions, may underestimate the overall mortality risk of these patients.
Secondly, the 4C Mortality Score fits into the risk prediction model with the APACHE II score for ICU patients. In studies comparing the risk scores developed specifically for COVID-19 patients, the 4C Mortality Score was the most accurate, with AUCs of 0.799 and 0.774; in our study, its AUC was 0.754. This score evaluates the comorbidities of patients more extensively than conventional ICU risk scores, which tend to rely on an on-the-spot evaluation of the organ systems. In our study, the combination of the APACHE II and 4C Mortality Score improved the calibration of the risk determination model in the regression analysis [12, 18]. Despite this, the R2 value of 0.358 in the final model was only satisfactory for predicting mortality in this group. Clearly, the clinical course in the ICU and the application of mechanical ventilation, renal replacement therapy and other treatment options are major contributors to the outcome. Thus, further studies should be done either to elucidate more risk factors or to define the key moments in the treatment of this specific population.
It is important to discuss the potential clinical implementation of our findings. During a pandemic with an overload of patients, the most valuable features of a mortality score are good discriminative performance and pragmatic identification of the patients who are likely to die and will not benefit from treatment. Thus, the 4C Mortality Score is well suited to the emergency department (ED). However, when evaluating the mortality of COVID-19 patients in the ICU, a more precise tool is needed. For example, since the 4C Mortality Score was developed for triage in the ED, a noninvasive oxygenation parameter (SpO2) was chosen to evaluate respiratory function. Pulse oximetric saturation correlates well with PaO2 in the range of 80–100%. However, critically ill COVID-19 patients usually stay in the ward until SpO2 levels drop below 80%, and the accuracy of SpO2 drops below that level. Furthermore, SpO2 strongly depends on blood flow, pulsatility and microcirculatory disturbances (such as microthrombosis), which can affect its accuracy. The APACHE II score, in turn, potentially offers a better evaluation of chronic respiratory failure because of the inclusion of base excess. Moreover, the APACHE II and SAPS II scores include many parameters that are usually normal on admission in COVID-19 patients but may deteriorate during the clinical course, leading to high prognostic value when admitting patients to the ICU. Thus, we conclude that there is a benefit in combining these scores. We suggest using the 4C Mortality Score when admitting patients to the hospital and considering it, alongside the calculation of the APACHE II score, when accepting the patient to the ICU.
This study is retrospective. Therefore, all the findings should be regarded as associations rather than as causal relationships. Furthermore, it is important to consider the sample size of our study and keep in mind that more patients would be needed to obtain more generalizable results. There is a slight difference in the age of the patients in the study when compared with the literature. The mean age of the patients in our study was 61.32 years, which lies within the range reported in several other studies, with mean ages of 50, 51.26 and 74 years [14, 16, 18]. However, the proportion of patients over 70 years of age is quite large, almost 30%. Increasing age is known to be associated with higher mortality risk and poor outcome. This and other differences between the population of our study and the studies that validated the risk scores (ethnic, demographic, cultural and economic conditions of the patients) should also be considered.
This was a study with a pragmatic approach, with its design oriented towards clinical decision-making. If a patient in the ED met the criteria for hospitalization, he/she was evaluated with the 4C Mortality Score, regardless of whether he/she was admitted to a general ward or directly to the ICU. If the patient was transferred to the ICU, either directly or later, he/she was evaluated with ICU risk scores (APACHE II, SOFA and SAPS II). Therefore, the study design has a potential for bias, since a patient from the general ward may deteriorate and be transferred in a graver state than the one in which he/she was hospitalized and evaluated with the 4C Mortality Score. However, the 4C Mortality Score was not developed for use in patients who are already hospitalized. Thus, to offer clinicians a pragmatic way of risk evaluation, we designed this study with evaluations at two points in time. In some cases, when the patient was admitted directly to the ICU, the two evaluations coincided; in others, they took place at different times.
The main finding of the study is that the APACHE II score was the most accurate and had the best discrimination at predicting mortality risk in COVID-19 patients treated in the ICU. However, the best calibration was observed when the 4C Mortality Score was added to the model. Therefore, the APACHE II score and 4C Mortality Score independently predict mortality risk and can be used concomitantly.
Availability of data and materials
The dataset used during the current study is available from the corresponding author on reasonable request.
- ICU:
Intensive care unit
- COVID-19:
Coronavirus disease 2019
- SAPS II:
Simplified Acute Physiology Score II
- APACHE II:
Acute Physiology and Chronic Health Evaluation II
- SOFA:
Sequential Organ Failure Assessment
- ROC-AUC:
Receiver operating characteristic area under the curve
- CKD:
Chronic kidney disease
- COPD:
Chronic obstructive pulmonary disease
- MAP:
Mean arterial pressure
- AKI:
Acute kidney injury
Chang R, Elhusseiny KM, Yeh YC, Sun WZ. COVID-19 ICU and mechanical ventilation patient characteristics and outcomes—a systematic review and meta-analysis. PLoS ONE. 2021;16:e0246318.
Giraud T, Dhainaut JF, Vaxelaire JF, Joseph T, Journois D, Bleichner G, et al. Iatrogenic complications in adult intensive care units: a prospective two-center study. Crit Care Med. 1993;21:40–51.
Phua J, Weng L, Ling L, Egi M, Lim CM, Divatia JV, et al. Intensive care management of coronavirus disease 2019 (COVID-19): challenges and recommendations. Lancet Respir Med. 2020;8:506–17.
Christian MD, Sprung CL, King MA, Dichter JR, Kissoon N, Devereaux AV, et al. Triage. Chest. 2014;146:e61S-e74S.
Haas LEM, de Lange DW, van Dijk D, van Delden JJM. Should we deny ICU admission to the elderly? Ethical considerations in times of COVID-19. Crit Care. 2020;24:321.
van Halem K, Bruyndonckx R, van der Hilst J, Cox J, Driesen P, Opsomer M, et al. Correction to: risk factors for mortality in hospitalized patients with COVID-19 at the start of the pandemic in Belgium: a retrospective cohort study. BMC Infect Dis. 2020;20:956.
Guidet B, de Lange DW, Boumendil A, Leaver S, Watson X, Boulanger C, et al. The contribution of frailty, cognition, activity of daily life and comorbidities on outcome in acutely admitted patients over 80 years in European ICUs: the VIP2 study. Intensive Care Med. 2020;46:57–69.
Lambden S, Laterre PF, Levy MM, Francois B. The SOFA score—development, utility and challenges of accurate assessment in clinical trials. Crit Care. 2019;23:374.
Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13:818–29.
Le Gall JR. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA. 1993;270:2957.
COVID-ICU Group on behalf of the REVA Network and the COVID-ICU Investigators. Clinical characteristics and day-90 outcomes of 4244 critically ill adults with COVID-19: a prospective cohort study. Intensive Care Med. 2021;47:60–73.
Knight SR, Ho A, Pius R, Buchan I, Carson G, Drake TM, et al. Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score. BMJ. 2020;370:m3339.
Vincent JL, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure: on behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. 1996;22:707–10.
Chiavone PA, Sens YA. Evaluation of APACHE II system among intensive care patients at a teaching hospital. Sao Paulo Med J. 2003;121:53–7.
Naved SA, Siddiqui S, Khan FH. APACHE-II score correlation with mortality and length of stay in an intensive care unit. J Coll Physicians Surg Pak. 2011;21:4–8.
Gupta R, Arora VK. Performance evaluation of APACHE II score for an Indian patient with respiratory problems. Indian J Med Res. 2004;119:273–82.
Ferrando C, Mellado-Artigas R, Gea A, Arruti E, Aldecoa C, Bordell A, et al. Características, evolución clínica y factores asociados a la mortalidad en UCI de los pacientes críticos infectados por SARS-CoV-2 en España: estudio prospectivo, de cohorte y multicéntrico. Rev Esp Anestesiol Reanim. 2020;67:425–37.
Covino M, De Matteis G, Burzo ML, Russo A, Forte E, Carnicelli A, et al. Predicting in-hospital mortality in COVID-19 older patients with specifically developed scores. J Am Geriatr Soc. 2021;69:37–43.
Falcão ALE, Barros AGA, Bezerra AAM, Ferreira NL, Logato CM, Silva FP, et al. The prognostic accuracy evaluation of SAPS 3, SOFA and APACHE II scores for mortality prediction in the surgical ICU: an external validation study and decision-making analysis. Ann Intensive Care. 2019;9:18.
Perkins GD, McAuley DF, Giles S, Routledge H, Gao F. Do changes in pulse oximeter oxygen saturation predict equivalent changes in arterial oxygen saturation? Crit Care. 2003. https://doi.org/10.1186/cc2339.
Ethics approval and consent to participate
Ethical approval was gained from the Vilnius Regional Research Ethics Committee to conduct the study, Reg. N. 2020/6-1233-718 (part of Lithuanian Bioethics Committee).
Consent for publication
The authors declare that they have no competing interests.
Vicka, V., Januskeviciute, E., Miskinyte, S. et al. Comparison of mortality risk evaluation tools efficacy in critically ill COVID-19 patients. BMC Infect Dis 21, 1173 (2021). https://doi.org/10.1186/s12879-021-06866-2