Objectives: The Simplified Acute Physiology 3 outcome prediction model has a narrow time window for recording physiologic measurements. Our objective was to examine the prevalence of missing physiologic data and its impact on the performance of the Simplified Acute Physiology 3 model.
Design: Retrospective analysis of prospectively collected data.
Setting: Sixty-three ICUs in the Swedish Intensive Care Registry.
Patients: Patients admitted during 2011–2014 (n = 107,310).
Interventions: None.
Measurements and Main Results: Model performance was analyzed using the area under the receiver operating characteristic curve, the scaled Brier score, and the standardized mortality ratio. We used a recalibrated Simplified Acute Physiology 3 model and examined model performance in the original dataset and in a dataset of complete records in which missing data were generated (simulated dataset). One or more values were missing in 40.9% of the admissions; missing data were more common in survivors and low-risk admissions than in nonsurvivors and high-risk admissions. Discrimination did not decrease with one to two missing variables, but accuracy was highest with no missing data. Calibration was best in the original dataset, with its mix of full records and records with some missing values (area under the receiver operating characteristic curve 0.85, scaled Brier score 27%, and standardized mortality ratio 0.99). With zero, one, and two variables missing, the scaled Brier score was 31%, 26%, and 21%; the area under the receiver operating characteristic curve was 0.84, 0.87, and 0.89; and the standardized mortality ratio was 0.92, 1.05, and 1.10, respectively. Datasets in which missing data were simulated for oxygenation, or for oxygenation and hydrogen ion concentration together, performed worse than datasets in which these data were originally missing.
Conclusions: Missing physiologic data are associated with admission type, low risk, and survival. Increasing loss of physiologic data reduced model performance and deflates predicted mortality risk, resulting in falsely high standardized mortality ratios.
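The three performance measures named above can be illustrated with a minimal sketch. It assumes the common definitions: the scaled Brier score as 1 − Brier/Brier_max, where Brier_max is the score of a model that predicts the observed death rate for every patient, and the standardized mortality ratio as observed deaths divided by the sum of predicted risks. The abstract does not state the exact formulas used, so these are assumptions. The function names, the toy data, and the 0.8 deflation factor are hypothetical and are not taken from the registry.

```python
def brier_score(died, risk):
    """Mean squared difference between predicted risk and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(died, risk)) / len(died)


def scaled_brier(died, risk):
    """1 - Brier / Brier_max; Brier_max is the score of a non-informative
    model that predicts the observed death rate for every patient."""
    prevalence = sum(died) / len(died)
    brier_max = prevalence * (1 - prevalence)
    return 1 - brier_score(died, risk) / brier_max


def standardized_mortality_ratio(died, risk):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    return sum(died) / sum(risk)


def auc(died, risk):
    """Probability that a random nonsurvivor has a higher predicted risk
    than a random survivor; ties count as one half."""
    pos = [p for y, p in zip(died, risk) if y == 1]
    neg = [p for y, p in zip(died, risk) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
died = [0, 0, 0, 1, 0, 1, 1, 0]
risk = [0.05, 0.10, 0.20, 0.60, 0.30, 0.80, 0.35, 0.60]

# Hypothetical deflation: if missing values lower every predicted risk,
# expected deaths fall while observed deaths stay the same.
deflated = [0.8 * p for p in risk]

print("AUC:", auc(died, risk))
print("Scaled Brier:", scaled_brier(died, risk))
print("SMR (original risks):", standardized_mortality_ratio(died, risk))
print("SMR (deflated risks):", standardized_mortality_ratio(died, deflated))
```

In this toy example the uniform deflation leaves the ranking of patients, and therefore the area under the curve, unchanged while the standardized mortality ratio rises. That mirrors the pattern described in the conclusions, although real missingness is unlikely to scale every risk by the same factor.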