Prediction algorithm for ICU mortality and length of stay using machine learning | Scientific Reports – Nature.com

Subjects

This was a retrospective cohort study performed using electronic health record data of consecutive patients admitted to the ICU at Chiba University Hospital, Japan, from November 2010 to March 2019. The surgical/medical ICU has 22 beds, with annual admissions ranging from 1,541 to 1,832 patients. Of the 16,169 screened patients, 12,747 were enrolled in the present study after the exclusion of 3,422 with missing data on clinical outcomes.

The study was approved by the Ethical Review Board of the Graduate School of Medicine, Chiba University (approval number: 3380), and performed in accordance with the Declaration of Helsinki. The review board waived the need for written informed consent, in conformity with the Ethical Guidelines for Medical and Health Research Involving Human Subjects in Japan.

Data collection and definitions

To develop prediction algorithms, the data of 91 input variables (Supplementary Table S11) were collected at the earliest time within 24 h after ICU admission from the ICU data system. These variables included (1) patient baseline characteristics (age, sex, height, weight, blood type, clinical department categories, diagnosis on admission, admission route [from emergency room, general ward, operating room, other hospitals] and APACHE II comorbidities [acquired immunodeficiency syndrome, acute myeloid leukemia/multiple myeloma, heart failure, lymphoma, respiratory failure, cancer metastasis, liver failure/cirrhosis, immunosuppressed status, and dialysis]); (2) blood tests (complete blood count, biochemistry, coagulation, and blood gas analysis); and (3) physiologic measurements (heart rate [HR], blood pressure, respiratory rate, peripheral oxygen saturation [SpO2], and body temperature). Numerical data with an input rate of less than 50% were not used for predictions.

Variable importance is defined as an index calculated by machine learning that indicates how much the model used the variable to make precise predictions.
The top three variables with the highest importance were defined as the key variables in this study. The length of ICU stay was analyzed in survivors and divided into three categories: short (within 1 week), medium (1–2 weeks), and long (more than 2 weeks). The short and long categories were considered to have high clinical importance because they were reported to be associated with ICU mortality and severity [16,19]. In addition, identifying patients who are at risk of a long ICU stay may contribute to adequate ICU management and help avoid ICU bed shortages [16].

Imputation for missing values

We performed multiple imputations (10 times) for the missing values of numerical data on a single dataset using sklearn.impute.IterativeImputer in Python (scikit-learn 0.22.1; https://scikit-learn.org). Dummy coding was used to convert categorical variables into binary variables. After missing value imputation, the dataset was randomly split into the training and test cohorts, comprising 80% and 20% of the dataset, respectively, and the variables were compared between the two cohorts.

Statistical analysis

The primary outcome variable was ICU mortality, and the secondary outcome variable was the length of ICU stay. Outcome prediction was performed using machine learning algorithms built with three types of classifiers, namely random forest (RF), XGBoost, and neural network, or using logistic regression analysis with either the APACHE II score or the SOFA score.
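The preprocessing steps described above (the 50% input-rate filter, multiple imputation, dummy coding, and the 80/20 split) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the toy variables, the seeds, the use of `sample_posterior=True` to obtain 10 distinct imputations, and the use of `OneHotEncoder` for dummy coding are all assumptions, and the text does not state how the 10 imputed datasets were combined.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (must precede the IterativeImputer import)
from sklearn.impute import IterativeImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n_patients = 200

# Hypothetical toy data: 5 numerical variables with ~10% missing values,
# plus one column (index 4) that is mostly missing.
X_num = rng.normal(size=(n_patients, 5))
X_num[rng.random(X_num.shape) < 0.1] = np.nan
X_num[rng.random(n_patients) < 0.7, 4] = np.nan

# Hypothetical categorical variable and binary outcome.
routes = np.array(["emergency room", "general ward", "operating room", "other hospitals"])
admission_route = rng.choice(routes, size=(n_patients, 1))
y = rng.integers(0, 2, size=n_patients)

# 1. Drop numerical variables whose input rate (share of non-missing values) is below 50%.
input_rate = 1.0 - np.isnan(X_num).mean(axis=0)
X_num = X_num[:, input_rate >= 0.5]

# 2. Multiple imputation: 10 imputed copies, each drawn with a different seed (assumed setup).
imputed_sets = [
    IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(X_num)
    for seed in range(10)
]

# 3. Dummy coding of the categorical variable into binary columns.
X_cat = OneHotEncoder().fit_transform(admission_route).toarray()

# 4. Random 80/20 split into training and test cohorts (shown for the first imputed copy).
X = np.hstack([imputed_sets[0], X_cat])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```

As in the text, imputation here is done on the whole dataset before splitting; fitting the imputer on the training cohort only is a common alternative when leakage between cohorts is a concern.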
RF is a standard ensemble machine learning method, and XGBoost, like RF, is a decision tree-based method that has been used frequently in recent years because of its accuracy on complex data. Unlike these two classifiers, the neural network is not a decision tree-based method. Because it is difficult to evaluate all machine learning methods, these three classifiers, which are representative and have different characteristics, were selected in this study. After machine learning algorithms were derived using the training cohort, the established algorithms were applied to the test cohort. As we found that the RF was superior to the other two machine learning models for the prediction of mortality, we confirmed the variable importance and key variables in the RF model. To evaluate the variable importance in the prediction, we used the feature importances (feature_importances_) provided by the Python scikit-learn package.

For robust clustering of ICU patients with higher risk factors for mortality, an RF dissimilarity measure was calculated to evaluate the similarity among patients. The RF dissimilarity measure is a method to evaluate the similarity between samples based on a trained RF model, where the similarity of samples is evaluated by the frequency with which two samples are classified into the same leaf in the decision trees of the RF model [20]. If two samples are classified into the same leaf in all decision trees, the RF dissimilarity between the two samples is 0 (completely the same). Conversely, if two samples are never classified into the same leaf, the RF dissimilarity is 1 (completely different). The more often they are classified into the same leaf, the closer the RF dissimilarity is to 0. The RF dissimilarity was then used as an input for UMAP to provide a 2D representation of the patients in the test cohort. UMAP is a type of manifold learning that allows us to place samples in a two-dimensional space while maintaining the distance (dissimilarity) between the samples [21].
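The key-variable selection and the RF dissimilarity described above can be sketched as follows. This is an illustrative sketch on synthetic data, assuming scikit-learn's `RandomForestClassifier`; the helper `rf_dissimilarity`, the toy dataset, and the forest size are hypothetical choices, and the authors' exact computation is not given in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for the patient variables and the outcome.
X, y = make_classification(n_samples=120, n_features=8, n_informative=4, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Key variables: indices of the three variables with the highest importance.
key_variables = np.argsort(rf.feature_importances_)[::-1][:3]


def rf_dissimilarity(forest, samples):
    """Return a matrix of 1 - (fraction of trees in which two samples share a leaf)."""
    # apply() gives, for each sample, the index of the leaf it reaches in every tree:
    # shape (n_samples, n_trees).
    leaves = forest.apply(samples)
    # Compare every pair of samples tree by tree: shape (n_samples, n_samples, n_trees).
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    # 0 = same leaf in all trees, 1 = never in the same leaf.
    return 1.0 - same_leaf.mean(axis=2)


dissimilarity = rf_dissimilarity(rf, X)
```

In the study the dissimilarity was computed for the test cohort; the sketch reuses the training data only to stay short. A matrix like `dissimilarity` could then be passed to a UMAP implementation that accepts precomputed distances (for example `umap.UMAP(n_components=2, metric="precomputed")` from the umap-learn package, which is an assumption here, since the text does not name the UMAP implementation used).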
Subsequently, clustering of ICU patients was obtained by visually identifying the distribution of each variable on the two scaling coordinates of UMAP. The clustering based on the RF dissimilarity measure performed in this study is a visualization of a supervised machine learning model. In supervised learning, the prediction results are probabilities, but the details of the prediction, such as “which samples are likely to be predicted incorrectly” or “which samples have similar prediction probabilities but different characteristics”, are not explicitly shown. Visualization and clustering based on the RF dissimilarity allow us to reveal the heterogeneity of the population and hard-to-predict samples.

To predict the length of ICU stay, we evaluated the short (short vs. not short) and long (long vs. not long) categories using machine learning with the RF algorithm and logistic regression analysis using the APACHE II or SOFA scores. In the same manner as in the analysis of mortality, variable importance and key variables associated with the prediction of length of ICU stay were confirmed. We also analyzed the predictive values for length of ICU stay using ordinalForest, which can estimate the predictive values for all three categories of ICU stay at the same time. All classifiers were implemented using Python, except for ordinalForest, which was executed with R.

Data are expressed as median (interquartile range) for continuous values and absolute numbers and percentages for categorical values. The AUC was calculated to evaluate the predictive values. Statistical significance was set at P < 0.05.
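The two binary length-of-stay tasks and their AUC evaluation can be sketched as below. This is a rough illustration on synthetic data, not the study's analysis: the day cut-offs (short as 7 days or fewer, long as more than 14 days), the variable `severity_score` standing in for APACHE II or SOFA, and the helper `categorize_los` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Hypothetical input variables, severity score, and length of ICU stay in days.
X = rng.normal(size=(n, 6))
severity_score = (3 * X[:, 0] + rng.normal(size=n)).reshape(-1, 1)
los_days = np.clip(np.round(8 + 4 * X[:, 0] + 3 * X[:, 1] + rng.normal(scale=3, size=n)), 1, None)


def categorize_los(days):
    """Assumed cut-offs: short <= 7 days, medium 8-14 days, long > 14 days."""
    return np.where(days <= 7, "short", np.where(days <= 14, "medium", "long"))


categories = categorize_los(los_days)
targets = {
    "short vs. not short": (categories == "short").astype(int),
    "long vs. not long": (categories == "long").astype(int),
}

# One 80/20 split shared by both tasks.
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.2, random_state=0)

auc = {}
for name, y in targets.items():
    # RF on all input variables versus logistic regression on the severity score alone.
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[idx_train], y[idx_train])
    lr = LogisticRegression().fit(severity_score[idx_train], y[idx_train])
    auc[name] = {
        "RF": roc_auc_score(y[idx_test], rf.predict_proba(X[idx_test])[:, 1]),
        "logistic": roc_auc_score(y[idx_test], lr.predict_proba(severity_score[idx_test])[:, 1]),
    }

for name, scores in auc.items():
    print(name, {model: round(value, 3) for model, value in scores.items()})
```

The ordinal three-category analysis with ordinalForest was run in R and is not reproduced here.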
Machine learning models were constructed using Python packages (sklearn.neural_network.MLPClassifier, sklearn.ensemble.RandomForestClassifier, xgboost, sklearn.linear_model.LogisticRegression) and an R package (ordinalForest 2.4.1).

Ethics approval and consent to participate

This study was approved by the research ethics committee of the Chiba University Graduate School of Medicine (approval number: 3380), which issued a waiver for written consent for the study because data collection was retrospective.
Source: https://www.nature.com/articles/s41598-022-17091-5