Geoffray Agard, Christophe Roman, Christophe Guervilly | Journal of Clinical Medicine | (2025)

Key Takeaways

Sample Definition And Size

The study was a retrospective cohort analysis using the MIMIC‑IV database (2008–2019). Of 38,750 invasive mechanical ventilation (MV) episodes identified, 9,849 (25.4%) lasted longer than 48 hours and were included; these involved 7,871 patients. Among these, 452 ventilator‑associated pneumonia (VAP) episodes occurred in 397 patients (4.1% of MV episodes >48 h). Most patients had one VAP episode; a few had multiple (up to four).

Study Type

Retrospective cohort study developing and internally validating a deep learning model (long short‑term memory neural network) for early VAP prediction, compared against traditional machine learning models (random forest, XGBoost, logistic regression).

Conflicts Of Interest

No conflicts of interest are declared in the article.

Results Summary

PREDICT achieved AUPRC values of 96.0%, 94.1%, and 94.7% for predicting VAP 6, 12, and 24 hours before onset, respectively. Sensitivity and positive predictive value (PPV) exceeded 85% across all horizons (e.g., sensitivity 89.7%, PPV 89.8% at 6 h; sensitivity 85.1%, specificity 99.2% at 24 h). AUROC was approximately 99% for all prediction windows. Calibration was strong, with Brier scores of 0.04 (6 h), 0.06 (12 h), and 0.10 (24 h). Integrated gradients analysis identified respiratory rate, SpO₂, and temperature as the most influential predictive features.
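
For readers unfamiliar with the Brier score, a minimal sketch of how it is computed follows: it is the mean squared difference between predicted probabilities and observed 0/1 outcomes, so 0 is perfect and lower is better. The function name `brier_score` and the toy values are illustrative assumptions, not data from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Toy example (made-up numbers, not from the paper).
toy_probabilities = [0.9, 0.1, 0.8, 0.3]
toy_outcomes = [1, 0, 1, 0]
toy_score = brier_score(toy_probabilities, toy_outcomes)
print(round(toy_score, 4))
```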

DOI

10.3390/jcm14103380

Full Text Open Access

true

Abstract

Background: Ventilator-associated pneumonia (VAP) is a common and serious ICU complication, affecting up to 40% of mechanically ventilated patients. The diagnosis of VAP currently relies on retrospective clinical, radiological, and microbiological criteria, which often delays targeted treatment and promotes the overuse of broad-spectrum antibiotics. The early prediction of VAP is crucial to improve outcomes and guide antimicrobial use related to this disease. This study aimed to develop and validate PREDICT (Pneumonia Risk Evaluation and Diagnostic Intelligence via Computational Technology), a deep learning algorithm for early VAP prediction that is based solely on vital signs. Methods: We conducted a retrospective cohort study using the MIMIC-IV database, which includes ICU patients who were ventilated for at least 48 h. Five vital signs (respiratory rate, SpO2, heart rate, temperature, and mean arterial pressure) were structured into 24 h temporal windows. The PREDICT model, based on a long short-term memory neural network, was trained to predict the onset of VAP 6, 12, and 24 h in the future. Its performance was compared to that of conventional machine learning models (random forest, XGBoost, logistic regression) using their AUPRC, sensitivity, specificity, and predictive values. Results: PREDICT achieved high predictive accuracy with AUPRC values of 96.0%, 94.1%, and 94.7% at 6, 12, and 24 h before the onset of VAP, respectively. Its sensitivity and positive predictive values exceeded 85% across all horizons. Traditional ML models showed a drop in performance over longer timeframes. Analysis of the model’s explainability highlighted the respiratory rate, SpO2, and temperature as key predictive features. Conclusions: PREDICT is the first deep learning model specifically designed for early VAP prediction in ICUs. It represents a promising tool for timely clinical decision-making and improved antibiotic stewardship.
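
As a rough illustration of the windowing described in the Methods, the sketch below slices an hourly series of the five vital signs into 24 h windows and labels each window by whether VAP onset falls within the chosen horizon. It is not the authors' pipeline: hourly sampling, the `make_windows` helper, the labeling rule, and the toy data are all assumptions made for illustration.

```python
# Order of the five vital signs in each hourly row (names are illustrative).
VITALS = ("resp_rate", "spo2", "heart_rate", "temperature", "mean_arterial_pressure")


def make_windows(hourly, onset_hour, horizon_h, window_h=24):
    """Build (window, label) pairs from hourly rows of vital signs.

    hourly     : list of rows, one per hour, each with len(VITALS) values
    onset_hour : index of the hour of VAP onset, or None if no VAP occurred
    horizon_h  : prediction horizon in hours (e.g. 6, 12, or 24)
    Assumed labeling rule: a window is positive if onset occurs within
    horizon_h hours after the window's last hour.
    """
    samples = []
    for end in range(window_h, len(hourly) + 1):
        last_hour = end - 1  # most recent hour contained in this window
        if onset_hour is not None and last_hour >= onset_hour:
            break  # do not build windows at or after onset
        window = [list(row) for row in hourly[end - window_h:end]]
        is_positive = onset_hour is not None and onset_hour - last_hour <= horizon_h
        samples.append((window, int(is_positive)))
    return samples


# Toy stay of 60 hours with a hypothetical onset at hour 40 (made-up values).
toy_stay = [[18.0 + 0.1 * h, 97.0, 80.0, 37.0, 75.0] for h in range(60)]
toy_samples = make_windows(toy_stay, onset_hour=40, horizon_h=6)
toy_labels = [label for _, label in toy_samples]
print(len(toy_samples), sum(toy_labels))
```

Each window is a 24 x 5 sequence, which is the shape a recurrent model such as an LSTM would consume one time step at a time.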

Referenced In

Mercedes C.

4 months ago

AI Papers and Discussion

TLDR: When using AI models to predict ventilator outcomes, is medication a good predictor? Why does performance always seem to be capped around 0.80? Is this limit due to model choice, input data type, or simply irreducible noise when predicting across diverse ICU populations?

Machine Learning for Mechanical Ventilation

⸂⸂⸜(രᴗര๑)⸝⸃⸃ Hey everyone!! 👋 Biomed engineering PhD student here - Always interested in seeing new ways technology might contribute to the healthcare space. Anyways, this study here recently caught my attention, and I’m curious to hear your thoughts.

During COVID-19, mechanical ventilation quickly became a hot topic, first because of global ventilator shortages and later because of the less-than-ideal outcomes of mechanical ventilation. High mortality and complications like pneumonia raised an uncomfortable question: are we using mechanical ventilation for the right patients, at the right time, for the right duration?

To answer this question, many researchers have jumped onto the AI boat, leveraging machine learning (ML) to analyze patient data. Several publications have emerged since then, such as:

Xu et al., 2024 used Random Forest to predict weaning difficulty in ventilated ICU patients, achieving 0.805 AUROC with 36 variables.
Agard et al., 2025 used a long short-term memory (LSTM) network to predict ventilator-associated pneumonia (VAP) 6-24 hours before it became clinically obvious, reaching 94-96% AUPRC with 5 variables.

A new Pharmacotherapy study, Murray et al., 2026, also used the very popular Random Forest and SVM models, but took a different approach to input variables: medication regimen complexity was used as a predictor of prolonged ventilation (>5 days) in the ICU.

The authors developed a “MRC-ICU” score to systematically quantify medication regimen complexity, assigning weighted values to 35 distinct medication categories, generating a total score.
Their justification: 70% of patients in the ICU are receiving more than 13 medications at any given time.
Previously, medication data was rarely highlighted as a primary predictor, because:
1. Medication data was difficult to compile and integrate.
2. Other papers found medication to be non-significant.
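
To make the idea of a weighted complexity score concrete, here is a minimal sketch of summing per-category weights over a patient's active medication orders. The category names and weights below are hypothetical placeholders, not the published MRC-ICU items, and counting every order (rather than each category once) is also an assumption.

```python
# Hypothetical categories and weights for illustration only;
# the real MRC-ICU score covers 35 categories with its own weights.
HYPOTHETICAL_WEIGHTS = {
    "vasopressor_infusion": 3,
    "continuous_sedation": 2,
    "broad_spectrum_antibiotic": 2,
    "stress_ulcer_prophylaxis": 1,
}


def regimen_complexity(active_orders, weights):
    """Sum the weight of every active order's category (assumed scoring rule)."""
    unknown = [category for category in active_orders if category not in weights]
    if unknown:
        raise KeyError(f"no weight defined for: {sorted(set(unknown))}")
    return sum(weights[category] for category in active_orders)


# Toy patient with four active orders (made-up example).
toy_orders = [
    "vasopressor_infusion",
    "continuous_sedation",
    "broad_spectrum_antibiotic",
    "broad_spectrum_antibiotic",
]
toy_total = regimen_complexity(toy_orders, HYPOTHETICAL_WEIGHTS)
print(toy_total)
```

The single total is what would then be handed to a model such as Random Forest as one input feature.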

While the overall accuracy of Murray’s MRC-ICU model (~0.78 AUROC) is not dramatically higher than that of other existing models, the authors argue that their study differs in scope: it predicts across a broad ICU population rather than a narrowly defined cohort, such as traumatic brain injury ("Prediction of in-hospital mortality in patients on mechanical ventilation post traumatic brain injury: machine learning approach") or congestive heart failure ("Machine Learning-Based Model for Predicting Prolonged Mechanical Ventilation in Patients with Congestive Heart Failure"). They suggest that this model may be best suited for identifying patients unlikely to require prolonged mechanical ventilation, as opposed to confidently identifying patients at high risk.

👉 Performance gap: Model choice or Input Variable?

Thus far, many studies (including Xu et al., 2024 and Murray et al., 2026) using traditional tree-based methods seem to hit a ~0.78-0.80 AUROC ceiling, while Agard's LSTM reached 94-96% AUPRC with just 5 variables.
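
One caveat on the comparison: the tree-based results are reported as AUROC and Agard's as AUPRC, and the two metrics can differ on the very same predictions. Below is a minimal sketch (toy scores, not data from any of the papers) that computes both from one set of model outputs. The helper names are my own, and AUPRC is approximated here by average precision, assuming no tied scores.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values measured at each positive, ranked by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return precision_sum / hits


# Toy imbalanced set: 2 positives among 10 cases (made-up scores).
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(round(toy_auroc, 3), round(toy_ap, 3))
```

So a 0.80 AUROC and a 94% AUPRC aren't on the same scale, and a like-for-like comparison would need the same metric on both sides.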

At face value, the advantage appears to be temporal awareness. Agard’s model was fed 24-hour sequences of vital signs, capturing trends, volatility, and timing. But it also raises the question:

Is this performance ceiling a limitation of tree-based architectures or simply a reflection of richer, more granular input data?

At the same time, given the ‘just ok’ performance (~0.78 AUROC) for Murray’s model, I think there are, in fact, more underlying uncertainties:

Is medication complexity genuinely useful and pulling its weight as a predictor?
Is MRC-ICU just a proxy for disease severity, and is it a good one when predicting across diverse ICU populations?
Are Random Forests becoming obsolete with all the new architectures available? Is it time to leave them behind?

(∿°○°)∿ Drop your thoughts! What do you all think?

Created: Feb 14, 2026
