Data Suggest Machine Learning Algorithms Can Accurately Predict C. Diff Infection in Hospitalized Patients

New data published today suggest that several commonly used machine learning algorithms (MLAs) can effectively predict which hospitalized patients will become infected with Clostridioides difficile (C. diff). The findings, which appear in the American Journal of Infection Control (AJIC), could support infection prevention and early diagnosis, as well as more timely implementation of infection control measures to minimize C. diff spread.

“Our study findings suggest that MLAs could play a significant role in reducing the clinical and economic impact of healthcare-associated infections such as C. diff by providing early predictions of at-risk patients prior to them developing serious complications,” said Jana Hoffman, vice president of science for Dascena, Inc. “These data are consistent with a growing body of evidence that validates artificial intelligence and MLAs as integral components of healthcare management that can improve patient outcomes and assist time-constrained clinicians in providing the best patient care.”

C. diff infection (CDI) is the leading cause of hospital-acquired diarrhea and is associated with significant morbidity, mortality, and healthcare costs. There is currently no gold standard tool to assess individual patients’ risk of acquiring CDI. Hoffman and her colleagues have previously published data demonstrating that MLAs can identify patients at risk of developing other high-impact healthcare-associated infections (HAIs).

For the study published today, the researchers used a database comprising electronic health record (EHR) patient data from more than 700 hospitals nationwide to train and then systematically evaluate three different methods spanning classical machine learning and deep learning. They first assessed multiple models built with each method to determine whether they could effectively predict CDI among hospitalized patients using early inpatient data, and then used a distinct, external dataset to evaluate the generalizability of the best-performing MLA models.

Results suggest that MLAs can predict CDI with excellent discrimination using just the first six hours of inpatient data. Among the three methods studied, a machine-learning method called XGBoost provided the highest overall accuracy in predicting CDI, despite being the least complex model. XGBoost also demonstrated generalizability by maintaining its predictive performance in an external dataset. The other two methods the researchers evaluated, a Deep Long Short-Term Memory (D-LSTM) neural network and a one-dimensional convolutional neural network (1D-CNN), also demonstrated high levels of predictive accuracy, though they were less generalizable.
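
For readers unfamiliar with the terms, "discrimination" is commonly summarized by the area under the receiver operating characteristic curve (AUROC), and "generalizability" by how well that score holds up on data the model never saw during training. This release does not state which metric the study reported, so the following is only a minimal sketch under that assumption. The function name, labels, and risk scores are hypothetical toy values for illustration, not study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive case (label 1)
    gets a higher risk score than a randomly chosen negative case (label 0).
    Tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = developed CDI, 0 = did not.
# Scores stand in for a model's predicted risk from early inpatient data.
internal_labels = [0, 0, 1, 1]
internal_scores = [0.1, 0.4, 0.35, 0.8]

# A separate, "external" set that played no part in model development.
external_labels = [0, 1, 0, 1, 0]
external_scores = [0.2, 0.9, 0.6, 0.5, 0.1]

internal_auc = auroc(internal_labels, internal_scores)
external_auc = auroc(external_labels, external_scores)

print(f"internal AUROC: {internal_auc:.3f}")
print(f"external AUROC: {external_auc:.3f}")
```

An AUROC of 0.5 corresponds to chance and 1.0 to perfect ranking of patients by risk. A model whose external score stays close to its internal score is, in this sense, generalizing.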

The best-performing XGBoost, D-LSTM and 1D-CNN models used similar features to predict CDI among patients, all of which have previously been identified as risk factors. In this study, age was the leading CDI risk factor, followed by clinical measurements such as sodium, body mass index, white blood cell count, and heart rate; active treatment with antibiotics or proton pump inhibitors; glycated hemoglobin; and race.

“This study supports earlier research suggesting that MLAs provide reliable infection-risk prediction that can empower clinical teams to implement appropriate infection control measures at earlier time points and thereby improve healthcare outcomes,” said Linda Dickey, RN, MPH, CIC, FAPIC, the 2022 APIC president.

Source: Association for Professionals in Infection Control and Epidemiology (APIC)