Introduction: Predictive models of adverse outcomes are developed to find combinations of variables that identify at-risk patients. The prediction accuracy of a model is assessed by discrimination (how well the model separates patients who experience the outcome from those who do not) and calibration (how well the model's predicted probabilities correspond to the observed probabilities of the outcome). The ultimate test of a model is applying it to a different cohort. Models tend to be overfit to the cohort on which they were developed: they perform well in that cohort but less well in a different one. We show that re-calibration can result in a valid and useful model.
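The two accuracy measures described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code used in the study: the function names, the toy data, and the choice to group patients into equal-sized risk groups for the calibration comparison are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def c_index(y: Sequence[int], p: Sequence[float]) -> float:
    """Discrimination: fraction of (event, non-event) pairs in which the
    patient with the event has the higher predicted probability.
    Tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_table(
    y: Sequence[int], p: Sequence[float], n_groups: int = 10
) -> List[Tuple[float, float]]:
    """Calibration: sort patients by predicted risk, split them into
    roughly equal-sized groups (an assumed grouping), and return
    (mean predicted probability, observed event rate) for each group."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    table = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if chunk:  # skip empty groups when n < n_groups
            mean_pred = sum(pi for pi, _ in chunk) / len(chunk)
            observed = sum(yi for _, yi in chunk) / len(chunk)
            table.append((mean_pred, observed))
    return table


# Toy data (hypothetical): 1 = in-hospital death, 0 = survived.
y = [0, 0, 1, 0, 1, 1, 0, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]

print(c_index(y, p))  # 0.875 (14 of 16 pairs are concordant)
for mean_pred, observed in calibration_table(y, p, n_groups=2):
    print(round(mean_pred, 3), round(observed, 3))
```

A well-calibrated model gives group-level observed rates close to the mean predicted probabilities; a model can have a high c-index and still fail this comparison.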
Methods: The Society of Thoracic Surgeons (STS) established the National Adult Cardiac Surgery Database (NCD) in response to inadequately risk-adjusted hospital mortality data released by the former Health Care Financing Administration (now the Centers for Medicare and Medicaid Services). Subsequently, the STS published models of several adverse outcomes, including in-hospital mortality for patients undergoing coronary artery bypass grafting surgery (CABG). We applied the regression coefficients of the risk factors from the STS in-hospital mortality model to a different cohort of CABG patients (patients who underwent CABG at Christiana Care Health System). We assessed discrimination with the c-index and calibration by the correlation of observed with predicted probabilities of in-hospital death. Because the model was poorly calibrated in our cohort, we then re-estimated the regression coefficients for our cohort using penalized logistic regression and calculated discrimination and calibration statistics for the re-calibrated model.
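The two modelling steps in the Methods (applying published coefficients to a new cohort, then re-estimating them with a penalty) can be illustrated as follows. This is a minimal sketch under stated assumptions: the abstract does not specify the form of the penalty, so a ridge (L2) penalty with an unpenalized intercept is assumed here, fitted by plain gradient descent. The function names, the single binary risk factor, and all numeric values are hypothetical and are not the STS coefficients or the study data.

```python
import math
from typing import List, Sequence, Tuple


def predict_risk(intercept: float, coefs: Sequence[float],
                 x: Sequence[float]) -> float:
    """Apply fixed logistic regression coefficients to one patient's
    risk factors and return the predicted probability of the outcome."""
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-lp))


def fit_ridge_logistic(
    X: Sequence[Sequence[float]], y: Sequence[int],
    lam: float = 1.0, lr: float = 0.1, n_iter: int = 5000,
) -> Tuple[float, List[float]]:
    """Re-estimate the intercept and coefficients on a new cohort by
    gradient descent on the negative log-likelihood plus
    (lam / 2) * sum(coef ** 2). The intercept is not penalized
    (an assumption of this sketch)."""
    n, k = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * k
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            err = predict_risk(intercept, coefs, xi) - yi
            g0 += err
            for j in range(k):
                g[j] += err * xi[j]
        intercept -= lr * g0 / n
        coefs = [b - lr * (gj + lam * b) / n for b, gj in zip(coefs, g)]
    return intercept, coefs


# Hypothetical cohort: one binary risk factor, 1 = in-hospital death.
X = [[0.0], [0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [1.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Step 1: apply externally published coefficients (made-up values here).
external = [predict_risk(-3.0, [1.0], xi) for xi in X]

# Step 2: re-estimate the coefficients on the local cohort with a penalty.
b0, b = fit_ridge_logistic(X, y, lam=1.0)
local = [predict_risk(b0, b, xi) for xi in X]

print("mean predicted, external coefficients:", sum(external) / len(X))
print("mean predicted, re-estimated model:   ", sum(local) / len(X))
print("observed event rate:                  ", sum(y) / len(y))
```

Because the intercept is left unpenalized, the re-estimated model's mean predicted probability matches the observed event rate in the cohort it was fitted on, while the penalty shrinks the risk-factor coefficients toward zero to limit overfitting.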
Results: When applied to our cohort, the STS model showed good discrimination (c-index 0.827) but was not well calibrated (Figure 1A). After re-calibration, the model showed both good discrimination (c-index 0.835) and good calibration (Figure 1B).
Conclusion: When a predictive model of adverse outcomes developed on one cohort of patients is applied to a different cohort, it needs to be re-calibrated to be useful.