Personalized Prognostic Prediction Models for Breast Cancer Recurrence and Survival Incorporating Multidimensional Data
Background: In this study, we developed integrative, personalized prognostic models for breast cancer recurrence and overall survival (OS) that consider receptor subtypes, epidemiological data, quality of life (QoL), and treatment.
Methods: Data from 15 314 women with stage I to III invasive primary breast cancer treated at The University of Texas MD Anderson Cancer Center between 1997 and 2012 were used to generate prognostic models by Cox regression analysis in a two-stage study. Model performance was assessed by the area under the curve (AUC) and by calibration analysis, and was compared with that of the Nottingham Prognostic Index (NPI) and PREDICT.
Results: Host characteristics were assessed for 10 809 women in the discovery population (median follow-up = 6.09 years, 1144 recurrences and 1627 deaths) and 4505 women in the validation population (median follow-up = 7.95 years, 684 recurrences and 1095 deaths). In addition to the known clinical/pathological variables, the model for recurrence included alcohol consumption, whereas the model for OS included smoking status and physical component summary score. The AUCs for recurrence and OS were 0.813 and 0.810 in the discovery population and 0.807 and 0.803 in the validation population, respectively, compared with AUCs of 0.761 and 0.753 in the discovery population and 0.777 and 0.751 in the validation population for NPI. Our model also showed better calibration than PREDICT. We also developed race-specific and receptor subtype–specific models with comparable AUCs. Racial disparity was evident in the distributions of many risk factors and in the clinical presentation of the disease.
Conclusions: Our integrative prognostic models for breast cancer exhibit high discriminatory accuracy and excellent calibration and are the first to incorporate receptor subtype and epidemiological and QoL data.
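Illustrative sketch: the comparison described in the Methods, a model's risk score versus the NPI judged by AUC, can be made concrete with a few lines of Python. This is not the study's code. The NPI formula used here is the standard published one (0.2 × tumor size in cm + nodal stage + histological grade). The `auc` helper treats the outcome as a simple yes/no indicator at a fixed horizon and ignores censoring, which is a simplifying assumption; the abstract does not state how censoring was handled. The function names and the toy inputs in the usage lines are hypothetical.

```python
def npi(size_cm, node_stage, grade):
    """Nottingham Prognostic Index (standard formula).

    size_cm    : tumor size in centimeters
    node_stage : 1 (no positive nodes), 2 (1-3 nodes), 3 (4 or more nodes)
    grade      : histological grade, 1 to 3
    """
    if node_stage not in (1, 2, 3) or grade not in (1, 2, 3):
        raise ValueError("node_stage and grade must each be 1, 2, or 3")
    return 0.2 * size_cm + node_stage + grade


def auc(scores, events):
    """Rank-based AUC: the probability that a patient with the event
    has a higher score than a patient without it (ties count 0.5).

    Assumption: events are binary indicators at a fixed time horizon;
    censoring is NOT accounted for in this sketch.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy usage with made-up patients (size_cm, node_stage, grade), not study data.
toy_patients = [(1.0, 1, 1), (2.5, 2, 2), (4.0, 3, 3), (1.5, 1, 2)]
toy_events = [0, 1, 1, 0]  # 1 = recurrence within the horizon (hypothetical)
toy_npi = [npi(*p) for p in toy_patients]
toy_auc = auc(toy_npi, toy_events)
```

Computing `auc` once with NPI scores and once with a fitted model's predicted risks on the same patients gives the kind of side-by-side discrimination comparison reported in the Results.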