Predicting sepsis onset in ICU using machine learning models: a systematic review and meta-analysis

Background: Sepsis is a life-threatening condition caused by an abnormal response of the body to infection. Because of its high mortality rate, it imposes a significant health and economic burden worldwide. Early recognition of sepsis is crucial for effective treatment. This study aimed to systematically evaluate the performance of various machine learning models in predicting the onset of sepsis.
Methods: We conducted a comprehensive search of the Cochrane Library, PubMed, Embase, and Web of Science databases, covering studies from database inception to November 14, 2022. We used the PROBAST tool to assess the risk of bias and summarized the predictive performance for sepsis onset using the C-index and accuracy. The study followed the PRISMA guidelines.
Results: We included 23 eligible studies with a total of 4,314,145 patients and 26 different machine learning models. The most frequently used models were random forest (n = 9), extreme gradient boosting (XGBoost; n = 7), and logistic regression (n = 6). In terms of accuracy, the random forest (test set n = 9, acc = 0.911) and XGBoost (test set n = 7, acc = 0.957) models performed best. In terms of the C-index, the random forest (n = 6, C-index = 0.79) and XGBoost (n = 7, C-index = 0.83) models showed the highest performance.
Conclusion: Machine learning has proven to be an effective tool for predicting sepsis at an early stage. However, additional machine learning methods are needed to obtain more accurate results. In our analysis, the XGBoost and random forest models exhibited the best predictive performance and were the most frequently used for predicting the onset of sepsis.
Trial registration: CRD42022384015
Supplementary Information: The online version contains supplementary material available at 10.1186/s12879-023-08614-0.


Introduction
Sepsis is a severe and potentially life-threatening condition resulting from a dysregulated immune response to infection [1]. Early detection and prompt treatment are crucial for improving patient outcomes and reducing health care costs. In recent years, machine learning (ML) models have emerged as promising tools for detecting and managing sepsis in the intensive care unit (ICU) [2]. These models use complex algorithms and statistical methods to learn from large volumes of patient data, including vital signs, laboratory results, and electronic health records, and to predict the onset of sepsis before its clinical manifestations become apparent [3]. Because early identification and treatment improve patient prognosis, machine learning-based warning systems that shorten recognition time are of particular interest. Adams et al. [4] set up a system called the "Targeted Real-time Early Warning System" and found that early warning systems have the potential to identify sepsis patients early, improve their prognosis, and prioritize the patients who would benefit the most from early treatment. By enabling early detection, ML models hold tremendous potential for enhancing patient care and reducing the burden of sepsis on health care systems worldwide.
The Sepsis-3 definitions suggest that patients with at least two of the following three clinical variables may be prone to the poor outcomes typical of sepsis: (1) low blood pressure (SBP ≤ 100 mmHg), (2) a high respiratory rate (≥ 22 breaths per min), or (3) altered mentation (Glasgow coma scale score < 15). Machine learning can be used to review a large number of clinical cases, and mature machine learning models can evaluate in real time whether a patient will develop sepsis, allowing for immediate intervention.
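The three-variable rule above (known in Sepsis-3 as the quick SOFA, or qSOFA, criteria) is simple enough to express directly. The following is a minimal sketch for illustration only; the class and field names are hypothetical and do not come from any of the reviewed models.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    """Bedside observations; the field names are hypothetical."""
    systolic_bp_mmhg: float
    resp_rate_per_min: float
    gcs: int  # Glasgow coma scale score, 3-15


def qsofa_score(v: Vitals) -> int:
    """Count how many of the three criteria are met (0-3)."""
    return sum([
        v.systolic_bp_mmhg <= 100,   # low blood pressure
        v.resp_rate_per_min >= 22,   # high respiratory rate
        v.gcs < 15,                  # altered mentation
    ])


def qsofa_positive(v: Vitals) -> bool:
    """At least two criteria flag the patient as at risk of poor outcomes."""
    return qsofa_score(v) >= 2


patient = Vitals(systolic_bp_mmhg=95, resp_rate_per_min=24, gcs=15)
print(qsofa_score(patient), qsofa_positive(patient))  # 2 True
```

A fixed rule of this kind uses three variables and hard thresholds, whereas the ML models reviewed here learn from many more predictors.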
In this study, we aimed to explore the use of ML models for predicting the onset of sepsis in the ICU. Specifically, we reviewed the literature on ML models for sepsis prediction, highlighting their strengths and limitations. Additionally, in this article, we discuss the potential impact of these models on patient outcomes, clinical decision-making, and health care costs. Through this meta-analysis, we hope to shed light on the promise of ML models as tools for improving the management of sepsis in the ICU and beyond.

Study design and literature search
This study retrieved relevant studies on the timing of sepsis diagnosis by machine learning from the Cochrane Library, Embase, PubMed, and Web of Science databases and extracted data from them. The databases were searched from inception to November 14, 2022. Search strategies were constructed from combinations of MeSH headings and free-text terms, with no restriction on language or region. The literature search was completed by Zhenyu Yang and Xiaoju Cui; the full search strategy is presented in Supplementary file 2. All selected studies were imported into EndNote 2020. We filtered studies according to the abstract, and duplicate articles were deleted. Literature screening was independently performed by two reviewers (Zhenyu Yang and Xiaoju Cui), and any disagreement was settled by a third reviewer.

Inclusion and exclusion criteria
Inclusion criteria.
(1) Randomized controlled trials (RCTs), prospective cohort studies, and nested case-control studies.
(2) Studies in which the predictive model was completely established.

Data extraction
The data extraction form was based on the modified CHARMS checklist. It recorded the name of the first author, publication date, country, duration of data collection, study design, type of validation (internal, external, random split and time split) and sample size (total number of participants, and the development and testing sets).

Risk of bias assessment
We used PROBAST and an external prognostic validity model to assess the risk of bias of the selected studies [5]. PROBAST is a checklist designed for systematic reviews of diagnostic or prognostic prediction models. The risk of bias was assessed independently by two reviewers (Zhe Song and Zhenyu Yang). PROBAST consists of two parts: (A) an overall risk of bias assessment covering four domains (participants, predictors, outcome and analysis) and (B) an overall applicability assessment covering three domains (participants, predictors and outcome).

Statistical analysis
We performed descriptive statistics to summarize the characteristics of the models. For prediction models that were evaluated in more than two independent datasets, a random-effects meta-analysis was conducted to estimate their pooled performance and accuracy. If a measure of uncertainty, such as the standard error or 95% confidence interval, was unavailable for the mean C-index, we computed it based on the number of events and participants. All data analyses were carried out using R software version 4.1.1.
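The two steps described above, deriving a standard error from event counts and pooling with a random-effects model, can be sketched as follows. The review does not state which estimators were used in R, so this sketch makes its own assumptions: the Hanley and McNeil approximation for the standard error of a C-index, and DerSimonian-Laird pooling on the logit scale. All function names and input numbers are hypothetical and purely illustrative.

```python
import math


def hanley_mcneil_se(c: float, n_events: int, n_nonevents: int) -> float:
    """Approximate SE of a C-index from event counts (Hanley-McNeil formula)."""
    q1 = c / (2 - c)
    q2 = 2 * c * c / (1 + c)
    var = (c * (1 - c)
           + (n_events - 1) * (q1 - c * c)
           + (n_nonevents - 1) * (q2 - c * c)) / (n_events * n_nonevents)
    return math.sqrt(var)


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def expit(x: float) -> float:
    return 1 / (1 + math.exp(-x))


def pool_random_effects(estimates, ses):
    """DerSimonian-Laird pooling on the logit scale.

    Returns (pooled estimate, lower 95% bound, upper 95% bound).
    """
    y = [logit(p) for p in estimates]
    # Delta method: SE on the logit scale is se / (p * (1 - p)).
    v = [(se / (p * (1 - p))) ** 2 for p, se in zip(estimates, ses)]
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1 / sum(w_re))
    return expit(mu), expit(mu - 1.96 * se_mu), expit(mu + 1.96 * se_mu)


# Made-up inputs for three hypothetical validation datasets.
c_indexes = [0.80, 0.85, 0.78]
events = [120, 300, 90]
nonevents = [880, 1700, 610]
ses = [hanley_mcneil_se(c, e, n) for c, e, n in zip(c_indexes, events, nonevents)]
pooled, lower, upper = pool_random_effects(c_indexes, ses)
print(f"pooled C-index {pooled:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```

Pooling on the logit scale keeps the back-transformed interval inside (0, 1), which matters for estimates close to 1.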

Study selection
A total of 422 articles were identified through the database searches, including the Cochrane Library (n = 12), Embase (n = 150), PubMed (n = 74), and Web of Science (n = 186). After eliminating 15 duplicate articles and excluding ineligible records using automation tools, we screened 387 articles. Ultimately, 23 articles met the inclusion criteria and were included in our study [2,. Figure 1 displays the PRISMA flow diagram illustrating our study selection process. The selection was conducted independently by two reviewers (Zhenyu Yang and Xiaoju Cui), and any discrepancies were resolved by a third reviewer.

Characteristics of included studies
A total of 1,287,160 individuals were included in this study, with 167,338 individuals included in the validation set. All articles analysed were published within the past 5 years, indicating a growing interest in the use of machine learning for sepsis prediction. Our research identified 81 prognostic models, including 5 based on deep learning, 4 based on InSight, 10 based on logistic regression, 6 based on multilayer perceptron, 8 based on neural networks, 8 based on support vector machines, 14 based on XGBoost, 15 based on random forest, and 11 based on SOFA. Detailed characteristics of the included studies can be found in Table 1.

Quality assessment
The quality assessment was conducted independently by two reviewers (Zhenyu Yang and Xiaoju Cui), and any discrepancies were resolved by a third reviewer. The results of the quality assessment are presented in the risk of bias figure (Fig. 2). Two studies (8.6%) were deemed to have a high risk of bias in the participant domain, 13 studies (58.3%) were deemed to have a high risk of bias in the analysis domain, and two studies (8.6%) were deemed to have a high risk of bias in the outcome domain. No studies were deemed to have a high risk of bias in the predictor domain. A high risk of bias in the analysis domain may be attributed to an inadequate sample size, insufficient events per variable (EPV), improper handling of missing data, or failure to report how missing data were handled. The PRISMA checklist can be found in Supplementary file 1.

Predictors
Age, creatinine levels, and sodium levels were the most frequently used predictors (n = 12), followed by blood pressure and platelet levels (n = 11) and heart rate (n = 9). The remaining predictors were ranked in descending order of frequency as follows: lactate levels and temperature (n = 9), the WBC count (n = 8), the respiratory rate and SOFA score (n = 7), glucose, haemoglobin, MCHC, and PaO2 levels (n = 6), the GCS score, ICU LOS, lymphocyte count, and PaCO2 levels (n = 5), and BUN levels, cancer, and sex (n = 4). These results are presented in Fig. 3.

Training set and test set accuracy
In the training set, the random forest model was the most frequently applied machine learning model (n = 9), with an accuracy of 0.911 (0.485, 0.991). The XGBoost model showed the best predictive performance (n = 6), with an accuracy of 0.970 (0.487, 0.997).
In the test set, the random forest model was also the most frequently applied machine learning model (n = 7), with an accuracy of 0.795 (0.638, 0.895). The deep learning model showed the best predictive performance (n = 3), with an accuracy of 0.830 (0.814, 0.845). These results are presented in Figs. 4, 5, 6, 7 and 8.

Training set and test set c-index
Regarding the c-index results, in the training set, the XGBoost model was the most frequently utilized machine learning model, with a c-index of 0.83 (0.83, 0.84) in 7 studies. The InSight model exhibited the best performance, with a c-index of 0.91 (0.90, 0.93) in 2 studies. In the test set, the random forest model was the most frequently employed machine learning model, with a c-index of 0.83 (0.82, 0.83) in 5 studies.
In terms of performance, the random forest model (n = 5, c-index = 0.83 (0.82, 0.83)) and XGBoost model (n = 3, c-index = 0.83 (0.82, 0.84)) performed similarly. Detailed datasets can be found in Figs. 9, 10, 11, 12 and 13, and the overall results are presented in Supplementary file 3.
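For a binary outcome such as sepsis onset, the c-index is the proportion of (event, non-event) patient pairs in which the patient who developed sepsis received the higher predicted risk, with ties counted as one half; this is equivalent to the area under the ROC curve. The sketch below illustrates the definition on made-up data. The function name and inputs are hypothetical, and the included studies may have computed the metric differently.

```python
def c_index(risks, outcomes):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Made-up predicted risks and observed outcomes (1 = sepsis, 0 = no sepsis).
risks = [0.9, 0.8, 0.4, 0.35, 0.1]
outcomes = [1, 0, 1, 0, 0]
print(round(c_index(risks, outcomes), 3))  # 5 of 6 pairs are concordant: 0.833
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which gives context for the pooled values of about 0.83 reported above.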

Discussion
The present study investigated 68 prognostic prediction models across 23 studies to assess the potential of machine learning models for predicting sepsis in the ICU. However, the risk of bias assessment revealed a high risk of bias in the analysis domain, which may be attributed to the small sample size, the processing of missing data, and the interpretation of complex data. Therefore, the research findings may be biased to some degree by the insufficient sample size.
Sepsis is a severe medical condition that can cause widespread inflammation and damage to vital organs. Early detection and treatment of sepsis are critical for improving patient outcomes and reducing health care costs. ML models can analyse large amounts of patient data, including vital signs, laboratory results, and electronic health records, to detect early signs of sepsis. ML algorithms can provide physicians with real-time recommendations for patient treatment and management based on the latest medical knowledge and patient data. The use of ML models for predicting the onset of sepsis in the ICU has the potential to revolutionize the way in which sepsis is detected, treated, and managed, leading to better patient outcomes and reduced health care costs.
Several studies have explored the potential of machine learning algorithms for predicting sepsis. Heather M et al. [28] developed a machine learning algorithm that predicts the impending occurrence of severe sepsis and septic shock with high specificity. Lucas M Fleuren et al. conducted a meta-analysis and found that individual machine learning models can accurately predict sepsis onset early, which is similar to the findings of the present study. Nianzong Hou et al. [29] developed an XGBoost model to predict 30-day mortality, which can assist clinicians in tailoring precise management and therapy for patients with sepsis. Dong Wang et al. [13] developed an artificial intelligence algorithm to predict sepsis early, which has shown good predictive ability in Chinese sepsis patients. However, external validation studies are necessary to confirm the generalizability of this method to other populations and to treatment practice.
In this study, we concluded that two machine learning algorithms, XGBoost and random forest, showed significant advantages in predicting sepsis incidence in ICU patients, with higher accuracy and c-index values than the other models in this study, specifically the random forest (test set n = 9, acc = 0.911) and extreme gradient boosting (test set n = 7, acc = 0.957) models. Unlike other studies, this study compared all previous machine learning models for predicting sepsis incidence in ICU patients, including 4,314,145 patients and 26 different machine learning models. This was a large, comprehensive study that strictly followed the PRISMA requirements for systematic evaluation and was methodologically rigorous. Based on this, we believe that our study is more accurate than previous studies.
XGBoost and random forest are the two machine learning algorithms that showed significant advantages compared to other models in the present study. XGBoost is a popular open-source software library that implements gradient boosting and is optimized for speed and scalability, making it one of the most efficient gradient boosting implementations available. It can handle missing data and noisy data, making it a robust solution for real-world data problems. Random forest is a widely used ensemble machine learning algorithm that combines multiple trees to form a forest and produces a final prediction by aggregating the results from all the trees. These algorithms have been applied in various industries, including finance, health care, and marketing, and have won several machine learning competitions [30]. We also found other studies using machine learning to predict the incidence of sepsis. Bloch et al. [31] conducted a study using machine learning to predict the onset of sepsis and found that the support vector machine (SVM) model had the best performance. Compared with the present study, the study by Bloch et al. focused on data from a single medical centre and did not evaluate data from other centres; therefore, its results reflect only that centre and have limited reference value for other regions.
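The aggregation idea behind random forest, described above, can be illustrated with a deliberately simplified sketch: each "tree" is a single-threshold decision stump fitted to a bootstrap sample, and the final prediction is a majority vote. This is bagging of stumps on one feature, not a full random forest (real implementations grow deeper trees and also sample features at random), and the lactate values and labels are made up for illustration.

```python
import random


def fit_stump(xs, ys):
    """Find the single-threshold rule with the fewest training errors."""
    values = sorted(set(xs))
    # Candidates: one threshold below all values, then midpoints between values.
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for positive_above in (True, False):
            errors = sum(int((x > t) == positive_above) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, positive_above)
    _, t, positive_above = best
    return t, positive_above


def predict_stump(stump, x):
    t, positive_above = stump
    return int((x > t) == positive_above)


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Aggregate the individual predictions by majority vote."""
    votes = sum(predict_stump(s, x) for s in forest)
    return int(votes * 2 > len(forest))


# Made-up toy data: one predictor (lactate, mmol/L) and a binary label.
lactate = [0.8, 1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.6, 4.2, 5.0]
sepsis = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
forest = fit_bagged_stumps(lactate, sepsis)
print(predict_forest(forest, 0.5), predict_forest(forest, 6.0))
```

Averaging over many resampled learners reduces the variance of any single tree, which is one commonly cited reason for the stable performance of tree ensembles on tabular clinical data.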

Conclusion
Machine learning has proven to be an effective tool for predicting sepsis at an early stage. However, additional machine learning methods are needed to obtain more accurate results. In our research, we found that XGBoost and random forest models are the most commonly used models for predicting sepsis incidence in ICU patients and that they exhibit significant advantages in performance and accuracy compared to other models. The use of predictive models for early risk assessment is relatively effective in preventing sepsis in ICU patients; however, it still needs further improvement. Therefore, we look forward to more validated machine learning methods based on convenient, noninvasive, or minimally invasive predictive indicators, which may offer high performance and accuracy in predicting sepsis incidence in ICU patients.

Limitations
This study also has some limitations. First, this study focused on the accuracy of machine learning models and did not include risk factors that lead to the high incidence rate of sepsis in ICU patients. Second, some included models contained special variables related to the diagnosis of sepsis (such as infection indicators), which are valuable for further validation and research in subsequent studies.

Fig. 2 Risk of bias assessment. This figure illustrates the risk of bias of the studies included in this study

Table 1 Detailed characteristics of the included studies