We sought proof of concept of a Big Data solution incorporating longitudinal structured and unstructured patient-level data from electronic health records (EHR) to predict graft loss (GL) and mortality. For a quality improvement initiative, GL and mortality prediction models were constructed using baseline and follow-up data (structured and unstructured; 0–90 days posttransplant for 1-year models and up to 1 year posttransplant for 3-year models) on adult solitary kidney transplant recipients transplanted during 2007–2015, as follows: Model 1: United Network for Organ Sharing (UNOS) data; Model 2: UNOS and Transplant Database (Tx Database) data; Model 3: UNOS, Tx Database, and EHR comorbidity data; and Model 4: UNOS, Tx Database, EHR data, posttransplant trajectory data, and unstructured data. A 10% 3-year GL rate was observed among 891 patients (2007–2015). Layering of data sources improved model performance for 3-year GL: Model 1: area under the curve (AUC), 0.66 (95% confidence interval [CI]: 0.60–0.72); Model 2: AUC, 0.68 (95% CI: 0.61–0.74); Model 3: AUC, 0.72 (95% CI: 0.66–0.77); Model 4: AUC, 0.84 (95% CI: 0.79–0.89). One-year GL (AUC, 0.87; Model 4) and 3-year mortality (AUC, 0.84; Model 4) models performed similarly. A Big Data approach significantly improves the performance of GL and mortality prediction models and is deployable within the EHR to optimize outcomes.
Big Data techniques allow unstructured text and longitudinal patient-level data to be incorporated into predictive models for graft loss in kidney transplant recipients, significantly improving predictive performance. See Ho et al.'s editorial on page 595.